I think tube tone is pretty easy to get. What emulators and solid state circuits almost never get right is the response characteristics of how tubes shift into distortion. Several things happen there. The sound starts sparkling clean. When the signal starts to exceed the tube's ability to reproduce the signal wave, there is an initial softening of the peaks, a soft-knee compression as it ramps into its limit of reproducing the signal. Then when the signal is fully past the tube's ability to reproduce the sound wave, it's fully clipped, but with soft corners on the wave, not square, because of the response time of the tube, attracting electrons across several grids to a plate. This gives a smooth, satisfying kind of distortion that never increases in volume, because you are at the limit of the tube's ability to reproduce a signal. You can play lightly for sparkling cleans, then dig in just slightly and have it fully saturated like the afterburner of a jet, with very little physical effort and no knob or switch changes. The actual sound and response is more like splitting the signal, sending half into a sparkling clean side and the other half into a saturated side, with a kind of soft-knee ducking compression that switches between the two sides. If you play a power chord and let it ring, it should clean up by itself in the tail of the sound as the signal drops back down into the range the tube can reproduce cleanly. Typically emulators and solid state circuits remain distorted even when the signal drops lower in the ring-out of a chord. That's one noticeable part of the difference. The compression and level characteristics are a couple of other differences.
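Here's a rough sketch of that behavior in Python, for anyone who wants to see it in numbers. It is not a model of a real tube or of any particular emulator. The `tanh` curve is just a common stand-in for a soft clipper, and the names (`soft_clip`, `decaying_chord`, `peak`) and all the values are made up for illustration. It shows a loud attack getting squashed under a fixed ceiling while the quiet tail of the same note passes through nearly untouched.

```python
import math


def soft_clip(sample, ceiling=1.0):
    """Memoryless soft clipper (assumed tanh curve, not a tube model).

    Small signals pass almost unchanged; large ones round off
    smoothly and never exceed `ceiling`.
    """
    return ceiling * math.tanh(sample / ceiling)


def decaying_chord(peak_level=4.0, freq_hz=110.0, decay_s=0.5,
                   seconds=3.0, rate=8000):
    """A single decaying sine standing in for a ringing chord (hypothetical values)."""
    count = int(seconds * rate)
    return [
        peak_level * math.exp(-(n / rate) / decay_s)
        * math.sin(2 * math.pi * freq_hz * n / rate)
        for n in range(count)
    ]


def peak(samples):
    """Largest absolute sample value in the list."""
    return max(abs(s) for s in samples)


dry = decaying_chord()
wet = [soft_clip(s) for s in dry]

# Compare the first and last 0.1 s of the note.
attack_ratio = peak(wet[:800]) / peak(dry[:800])    # well below 1: heavily limited
tail_ratio = peak(wet[-800:]) / peak(dry[-800:])    # close to 1: essentially clean
```

The attack comes out at roughly a quarter of its input level, and the tail comes out at practically the same level it went in, which is the "cleans up by itself" part. What this doesn't capture is the feel, which is where the compression described below comes in.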
The only devices I've used that did a pretty good job, without any tubes, were the Carl Martin PlexiTone and the RambleFX Marvel Drive pedals. There are other circuits that do it; I just haven't owned them.
With everything else (my SansAmp PSA-1, or other distortion/amp pedals or emulations), the way I get there is to add a compressor after the distortion that has threshold, knee, ratio and makeup gain controls. I set the compressor to trigger about where the onset of distortion is, use a soft knee setting, and only apply enough ratio to keep the overall level consistent, but I don't slam it like a leveler (it won't sound natural). The makeup gain is usually set to about half of what the ratio would calculate to. For example, if distortion happens at -4 dB and the compressor kicks in at 2:1, then -4 dB to 0 dB will be compressed to about -4 dB to -2 dB, so I'd add enough makeup gain, 1 dB to 2 dB, to bring the resulting signal nearer to 0 dB at its peak. That's just a starting point; you really gotta finalize it by ear and feel until it sounds and feels natural like a real amp.
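If it helps to see the arithmetic, here's a small sketch of it in Python. It assumes the textbook static curve for a downward compressor with a quadratic soft knee, which may not match what any given pedal or plugin does internally. The function names and the `fraction` parameter are mine, made up for the example.

```python
def compressed_level_db(level_db, threshold_db, ratio, knee_db=0.0):
    """Output level in dB for a given input level in dB.

    Assumed textbook curve: 1:1 below the knee, full ratio above it,
    and a quadratic blend inside a knee of total width `knee_db`.
    """
    over = level_db - threshold_db
    if knee_db > 0 and 2 * abs(over) <= knee_db:
        # Inside the knee: ease gradually from 1:1 into the full ratio.
        return level_db + (1 / ratio - 1) * (over + knee_db / 2) ** 2 / (2 * knee_db)
    if over <= 0:
        return level_db  # below threshold, untouched
    return threshold_db + over / ratio


def starting_makeup_db(peak_db, threshold_db, ratio, knee_db=0.0, fraction=0.5):
    """Starting-point makeup gain: a fraction of the gain reduction at the peak."""
    reduction = peak_db - compressed_level_db(peak_db, threshold_db, ratio, knee_db)
    return fraction * reduction


# The numbers from the post: distortion onset at -4 dB, 2:1 ratio, peaks at 0 dB.
peak_out = compressed_level_db(0.0, -4.0, 2.0)   # -2.0 dB
makeup = starting_makeup_db(0.0, -4.0, 2.0)      # 1.0 dB, half of the 2 dB reduction
```

With `fraction=1.0` you get the full 2 dB, which is the top of the 1 dB to 2 dB range. Either way it's only a number to start from before tuning by ear.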