Apparently, it does not scale properly with SLI.
As for TXAA and SLI scaling, TXAA actually forced the driver team to fix some SLI performance bugs, and those changes made it into the first driver which provided TXAA support. The changes should help any game using a temporal effect: this includes CryEngine with its temporal AA effect, and anyone doing temporal feedback for screen-space ambient occlusion. That being said, SLI scaling is a challenging problem, and you could likely still see problems with any number of other things unrelated to TXAA.
TXAA requires a GPU-to-GPU copy with SLI which is proportional to the resolution of the framebuffer. This copy doesn't cause stalling problems with SLI because the data for the copy is generated at the time TXAA runs and isn't needed until the start of the next time TXAA runs (which with SLI will be on another GPU). The copy itself can load up the PCIe bus, though, which can limit scaling somewhat; on systems with crappy PCIe buses, this will be a problem.
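To put that copy in perspective, here's a minimal back-of-the-envelope sketch; the buffer format, buffer count, and PCIe generation are my assumptions, not published TXAA details:

    #include <cstdio>

    int main() {
        // Assumed: one 1920x1080 feedback buffer in RGBA16F (8 bytes/pixel)
        // copied between GPUs each frame at 60 fps. The exact buffers and
        // formats TXAA copies aren't public, so these are illustrative.
        const double bytesPerFrame  = 1920.0 * 1080.0 * 8.0;   // ~16.6 MB
        const double bytesPerSecond = bytesPerFrame * 60.0;    // ~1.0 GB/s
        const double pcie2x16       = 8.0e9;  // usable bandwidth, one direction
        printf("copy: %.1f MB/frame, %.2f GB/s, %.0f%% of PCIe 2.0 x16\n",
               bytesPerFrame / 1.0e6, bytesPerSecond / 1.0e9,
               100.0 * bytesPerSecond / pcie2x16);
        return 0;
    }

Around a gigabyte per second is nothing on a full x16 link, but on a narrow or shared link it starts to eat into scaling.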
Even then, DICE didn't exactly implement MSAA correctly in BF3. The amount of stuff it doesn't touch is incredibly sad.
I wouldn't hammer on DICE for this; those guys are actually quite awesome. The problem you are seeing with MSAA not removing aliasing on trees and such is caused by the fact that MSAA aims to shade only one sample per pixel. The outline on tree leaves is generated in the shader ("alpha test"), not from actual geometry (triangle edges are the only things which get multiple samples by default). This is a common problem in just about every game with MSAA.
There is one technique developers can use to increase quality with "alpha test", which involves selectively super-sampling in critical parts of the engine. For instance, one can render a depth-only pass with super-sampling and "alpha test" on to get a good outline via multiple samples, then do a second pass to shade without super-sampling to avoid the extra shading cost.
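A minimal sketch of that idea in GL 4.x terms (the pass structure is my illustration, not DICE's or any shipping engine's code; drawAlphaTestedGeometry is a hypothetical helper):

    // Assumes an MSAA framebuffer is bound and drawAlphaTestedGeometry()
    // draws with a fragment shader doing `if (alpha < cutoff) discard;`.
    void renderAlphaTestedWithSampleCoverage() {
        // Pass 1: depth only, alpha test evaluated per sample, so leaf
        // outlines get real sub-pixel coverage.
        glEnable(GL_MULTISAMPLE);
        glEnable(GL_SAMPLE_SHADING);
        glMinSampleShading(1.0f);        // force per-sample shader execution
        glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
        glDepthMask(GL_TRUE);
        glDepthFunc(GL_LESS);
        drawAlphaTestedGeometry();

        // Pass 2: shade once per pixel; the per-sample depth test reuses
        // the pre-pass coverage, so only surviving samples get color.
        glDisable(GL_SAMPLE_SHADING);
        glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
        glDepthMask(GL_FALSE);
        glDepthFunc(GL_LEQUAL);
        drawAlphaTestedGeometry();
    }

The per-sample depth pre-pass is where the super-sampling cost goes; the second pass pays only the normal one-shade-per-pixel MSAA cost.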
lazy developers do not implement proper AA
MLAA/FXAA/SMAA definitely didn't help here. But the real trend started with the popularity of deferred rendering, combined with consoles not having enough performance to do deferred with AA, and the techniques for reducing the cost of deferred rendering with MSAA not yet being widely understood.
Actually yes, they did. I don't know what the underlying technical reason is, but it's considered to be a bad thing to have the drivers overriding the developer's intentions with LOD.
The problem with screwing with LOD is that you need to know which textures to apply the bias to; biasing LOD on the wrong textures causes problems. The best thing here would be for developers to build this bias option into their games, touching only the correct textures.
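In OpenGL terms that per-texture control is trivial; a minimal sketch (foliageTex is a hypothetical handle, and -0.5 is just an example value):

    // Bias only the textures that benefit (e.g. alpha-tested foliage);
    // textures that would shimmer with sharper mip selection (noisy
    // detail maps, UI atlases, etc.) keep the default bias of 0.
    glBindTexture(GL_TEXTURE_2D, foliageTex);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_LOD_BIAS, -0.5f);

A driver-level override can't make this distinction, which is exactly why it gets textures wrong.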
With this statement being in reference to TXAA having a similar cost to MSAA, I don't buy it.
I forget if I posted some numbers here, but 4xTXAA on a mid-range mobile GT 650M at 1280x800 takes 1.3 ms/frame extra over just turning on 4xMSAA. In BF3's highly optimized MSAA implementation, the cost of just turning on 4xMSAA is about 9.1 ms/frame (at the same resolution on the same GPU). So yeah, most of the cost of TXAA is in just the MSAA part, and the cost of the MSAA part will vary greatly per title. Also, there are some optimizations going into the fixed cost of the TXAA bit, which will hit later titles.
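Working those numbers through, the fixed TXAA bit is a small slice of the total AA cost:

    #include <cstdio>

    int main() {
        // Numbers from above: GT 650M, 1280x800, BF3.
        const double txaaExtraMs = 1.3;   // TXAA on top of 4xMSAA
        const double msaaMs      = 9.1;   // just enabling 4xMSAA
        const double totalMs     = txaaExtraMs + msaaMs;       // 10.4 ms
        printf("total AA cost %.1f ms, TXAA share %.0f%%\n",
               totalMs, 100.0 * txaaExtraMs / totalMs);        // ~13%
        return 0;
    }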
SSAA is already capable of delivering what TXAA does
The difference is that with TXAA the shading cost is the same as with MSAA. 4xMSAA might only have a 50% hit in a title, while 4xRGSSAA is going to be around a 400% hit for the pixel-shading part of the frame (which at high resolutions is the dominant cost).
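A quick illustration with a made-up budget (only the 1x-vs-4x shading ratio matters here; the millisecond figures are hypothetical):

    #include <cstdio>

    int main() {
        // MSAA shades ~1 sample per pixel regardless of sample count, so
        // its extra cost is mostly edge and bandwidth work; 4x RGSSAA runs
        // the pixel shader for all 4 samples.
        const double baseShadeMs = 10.0;                 // hypothetical
        const double msaaShadeMs = baseShadeMs * 1.0;    // ~same shading work
        const double ssaaShadeMs = baseShadeMs * 4.0;    // 4x the shading work
        printf("1x: %.0f ms, 4xMSAA: %.0f ms, 4xRGSSAA: %.0f ms shading\n",
               baseShadeMs, msaaShadeMs, ssaaShadeMs);
        return 0;
    }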