* simplified framebuffer management
* convert texture to rgba32
* remove float copy/blit
* adding SIMD framebuffer read/write
This adds SIMD paths for framebuffer read/write and for texel fetch.
It supports SSE2 and SSE4.1, with a SISD fallback. ARM NEON support is also included; it is conceptually identical but still needs testing.
* consistency
* tweaks
* review of `sw_texture_sample_linear`
* better quad sorting
unrelated to the PR, but at least it's done
* ignore some pipeline state in certain contexts
* convention tweaks
---------
Co-authored-by: Ray <raysan5@gmail.com>