| 84a01de5 | 12-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Serial and Blocked AVX Backends (#198)
* Add serial AVX backend
* Style and README changes
* Simplify AVX serial tensor loop
* Minor performance improvement
* C=1 AVX scalar case
* Increase use of AVX commands for edge cases
* Prep for eventual Tensor Object
* Comment updates
* Readme update
* Update README
* Refactor to reduce code
* Increase vectorization in remainder of columns
* Vectorize column remainder on C=1 case
* Switch to static inlining for AVX tensor contract
* Tidying for merge
* make style
* Style cleanup
* Full register use for columns
* Make style
|