[ODE] Some SSE in Quick step

Frederic Marmond fmarmond at eprocess.fr
Tue May 25 14:39:27 MST 2004


Well, I didn't read the whole thread, but I think that D3D is OFF TOPIC 
here, as it is not cross-platform.
One can use conditional compilation (#if) to include optimized 
assembly for particular architectures (SSE, 3DNow!, ...). That stays 
portable between OSes (ODE runs on many OSes other than Windows) and 
between hardware architectures (the 'assembly' code, or whatever else 
you use, will be compiled only on the matching architecture).
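
To make the conditional-compilation idea concrete, here is a minimal 
sketch. It is not ODE code: the HAVE_SSE macro and the add4 function are 
hypothetical names made up for this example, and it uses compiler 
intrinsics from <xmmintrin.h> rather than hand-written assembly. The SSE 
path is compiled only where the compiler reports SSE support; everywhere 
else the plain loop is used.

```cpp
#include <cassert>

// Detect SSE at compile time. __SSE__ is set by GCC/Clang, _M_X64 and
// _M_IX86_FP by MSVC. HAVE_SSE is a hypothetical macro for this sketch.
#if defined(__SSE__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#  include <xmmintrin.h>
#  define HAVE_SSE 1
#else
#  define HAVE_SSE 0
#endif

// Computes out[i] = a[i] + b[i] for i = 0..3.
void add4(const float* a, const float* b, float* out)
{
    assert(a && b && out);
#if HAVE_SSE
    // SSE path: unaligned loads/stores, so callers need no special alignment.
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
#else
    // Portable fallback, compiled on every other architecture.
    for (int i = 0; i < 4; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

Callers just call add4(a, b, out) and never see which path was built, so 
the same source compiles on any OS and any CPU.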

Fred
PS: Of course, it is always a good idea to compile with at least -O2, 
clean up your loops, avoid unnecessary initialisations, ...

devicezero wrote:

> IMHO, the D3D matrix stuff is faster because it is very well coded C++. 
> All of the D3D math parts are good and fast.
>
> Also, miracles can happen if you compile with /Ox, which means FULL 
> OPTIMIZATION.
> Premature code optimizations sometimes don't make sense, because they 
> can defeat the compiler's global optimizations.
>
>
>
> DirectX (D3D) has D3DXMATRIXA16, which derives publicly from D3DXMATRIX:
>
> typedef D3DX_ALIGN16 _D3DXMATRIXA16 D3DXMATRIXA16;
>
> where D3DX_ALIGN16 is defined as:
>
> #if _MSC_VER >= 1300  // VC7
> #define D3DX_ALIGN16 __declspec(align(16))
> #else
> #define D3DX_ALIGN16  // Earlier compiler may not understand this, do nothing.
> #endif
>
>
> and from headers:
>
> //---------------------------------------------------------------------------
> // Aligned Matrices
> //
> // This class helps keep matrices 16-byte aligned as preferred by P4 cpus.
> // It aligns matrices on the stack and on the heap or in global scope.
> // It does this using __declspec(align(16)) which works on VC7 and on VC 6
> // with the processor pack. Unfortunately there is no way to detect the
> // latter so this is turned on only on VC7. On other compilers this is the
> // the same as D3DXMATRIX.
> //
> // Using this class on a compiler that does not actually do the alignment
> // can be dangerous since it will not expose bugs that ignore alignment.
> // E.g if an object of this class in inside a struct or class, and some code
> // memcopys data in it assuming tight packing. This could break on a compiler
> // that eventually start aligning the matrix.
> //---------------------------------------------------------------------------
>
> #ifdef __cplusplus
> typedef struct _D3DXMATRIXA16 : public D3DXMATRIX
> {
>    _D3DXMATRIXA16() {}
>    _D3DXMATRIXA16( CONST FLOAT * );
>    _D3DXMATRIXA16( CONST D3DMATRIX& );
>    _D3DXMATRIXA16( CONST D3DXFLOAT16 * );
>    _D3DXMATRIXA16( FLOAT _11, FLOAT _12, FLOAT _13, FLOAT _14,
>                    FLOAT _21, FLOAT _22, FLOAT _23, FLOAT _24,
>                    FLOAT _31, FLOAT _32, FLOAT _33, FLOAT _34,
>                    FLOAT _41, FLOAT _42, FLOAT _43, FLOAT _44 );
>
>    // new operators
>    void* operator new   ( size_t );
>    void* operator new[] ( size_t );
>
>    // delete operators
>    void operator delete   ( void* );   // These are NOT virtual; Do not
>    void operator delete[] ( void* );   // cast to D3DXMATRIX and delete.
>
>    // assignment operators
>    _D3DXMATRIXA16& operator = ( CONST D3DXMATRIX& );
>
> } _D3DXMATRIXA16;
>
> #else //!__cplusplus
> typedef D3DXMATRIX  _D3DXMATRIXA16;
> #endif //!__cplusplus
>
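
The alignment macro quoted above is itself a case of conditional 
compilation, and the same trick can be written so that it is not tied to 
one compiler. The following is only a sketch under assumptions: MY_ALIGN16, 
Matrix4, isAligned16 and assertAligned16 are hypothetical names, not part 
of D3DX or ODE. It covers stack and global objects only; it does not 
replace the operator new/delete overloads that the D3DX class uses for 
heap allocations.

```cpp
#include <cassert>
#include <cstddef>

// Pick an alignment specifier the current compiler understands.
// MSVC reports an old __cplusplus by default, hence the _MSC_VER check.
#if __cplusplus >= 201103L || (defined(_MSC_VER) && _MSC_VER >= 1900)
#  define MY_ALIGN16 alignas(16)
#elif defined(_MSC_VER)
#  define MY_ALIGN16 __declspec(align(16))
#elif defined(__GNUC__)
#  define MY_ALIGN16 __attribute__((aligned(16)))
#else
#  define MY_ALIGN16  // Unknown compiler: no alignment guarantee.
#endif

// A 4x4 float matrix (64 bytes) that asks for 16-byte alignment.
struct MY_ALIGN16 Matrix4
{
    float m[4][4];
};

// Returns true if p sits on a 16-byte boundary.
bool isAligned16(const void* p)
{
    return reinterpret_cast<std::size_t>(p) % 16 == 0;
}

// Debug-build check to run before handing a matrix to aligned SSE code.
void assertAligned16(const Matrix4& mat)
{
    assert(isAligned16(&mat));
}
```

As the quoted header comment warns, on a compiler that falls into the 
empty-macro branch the type silently loses its alignment, so a check like 
assertAligned16 is worth keeping in debug builds.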


