I'm guessing by 'renders twice' you mean the CPU is sending the image to GPU to draw on screen twice. If that's the case then doing it with a texture would be way faster because it would actually get accelerated by the GPU.
Can't say exactly how it would look in C, but here it is in HLSL targeting DX8-class hardware, pixel shader 1.1 [that's GeForce 3 and up].
Video echo: assume that the full-screen quad has 4 vertices, with xy values of [-1,-1] [-1,1] [1,-1] [1,1].
Vertex shader:
struct VS_OUTPUT
{
    float4 pos : POSITION0;
    float2 texCoord : TEXCOORD0;   // normal coordinates
    float2 texCoordB : TEXCOORD1;  // mirrored on the x axis
};
VS_OUTPUT vs_main( float4 inPos: POSITION )
{
    VS_OUTPUT Out = (VS_OUTPUT) 0;
    // the quad is already in clip space, so pass xy straight through
    Out.pos = float4( inPos.xy, 0.0f, 1.0f);
    // get into range [0,1]
    Out.texCoord = (float2(Out.pos.x,Out.pos.y)+1.0f)/2.0f;
    // flip horizontally: u becomes 1-u, v is unchanged
    Out.texCoordB = float2(1.0f-Out.texCoord.x,Out.texCoord.y);
    return Out;
}
This hands the pixel shader a quad that exactly covers the screen, plus two sets of texture coordinates, where the second is flipped on the x axis.
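If it helps to check the coordinate math on the CPU side, here is a rough C sketch of what vs_main computes for the texture coordinates. The Vec2 type and the function names are made up for illustration, they are not part of MilkDrop or Direct3D.

```c
/* Hypothetical CPU-side mirror of the texture-coordinate math in vs_main. */
typedef struct { float u, v; } Vec2;

/* Map a clip-space position in [-1, 1] to a texture coordinate in [0, 1]. */
static Vec2 tex_coord(float x, float y)
{
    Vec2 t = { (x + 1.0f) / 2.0f, (y + 1.0f) / 2.0f };
    return t;
}

/* Mirror a texture coordinate on the x axis (this is the texCoordB set). */
static Vec2 tex_coord_flipped(Vec2 t)
{
    Vec2 f = { 1.0f - t.u, t.v };
    return f;
}
```

So the corner at [-1,1] gets u = 0 in the first set and u = 1 in the flipped set, while v is the same in both.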
Pixel shader:
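sampler2D Texture0; // the MilkDrop-processed image
float4 ps_main( float2 texCoord : TEXCOORD0, float2 texCoordB : TEXCOORD1 ) : COLOR
{
    float4 result;
    float4 A = tex2D( Texture0, texCoord );   // normal image
    float4 B = tex2D( Texture0, texCoordB );  // mirrored image
    result = (A + B)/2;                       // 50/50 blend of the two
    return result;
}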
Texture0 is of course assumed to be the texture sampler for the MilkDrop-processed image.
As for gamma, this is really just multiplication, so we should be able to add
result *= gamma; before the return,
except that in DX8 pixel shaders we can't multiply by values over 1. We can get around this by adding the result to itself several times and then multiplying by gamma. This of course means that if we add enough times for 4x brightness, a user-set gamma value of "1.0" actually corresponds to 0.25 when we multiply by "gamma", since our result starts out at 400%. Hope that made sense >^~^<
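To make the gamma arithmetic concrete, here is a small C sketch of the 4x case. It only models the ideal math and assumes the intermediate 4x value isn't clamped along the way, which real hardware may not guarantee. The function names are hypothetical.

```c
/* Clamp a value to [0, 1], like the final colour written by the shader. */
static float saturate(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* Hypothetical helper: the constant actually handed to the shader.
   With a 4x pre-boost, a user gamma of 1.0 becomes 0.25. */
static float shader_gamma_constant(float user_gamma)
{
    return user_gamma / 4.0f; /* stays <= 1 as long as user_gamma <= 4 */
}

/* Brighten by repeated addition, then scale by the (<= 1) constant. */
static float apply_gamma(float blended, float user_gamma)
{
    float k = shader_gamma_constant(user_gamma);
    float boosted = blended + blended; /* 2x */
    boosted = boosted + boosted;       /* 4x */
    return saturate(boosted * k);
}
```

With this scheme the largest user gamma you can express is 4.0; to go higher you would add more times and divide the constant by the matching factor.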
Eo.S. >^^<
