Sitara Device Crypto Performance Comparison

From Texas Instruments Wiki


Under Construction

Device Comparison

This page compares the cryptographic performance of different Sitara devices.  Devices that have cryptographic accelerators are tested with and without acceleration to show how much the acceleration improves performance over software-only cryptography.


Table of devices


Device                 HW acceleration
AM18x                  none
AM35x                  AES, DES, 3DES, SHA, MD5
AM37x (BeagleBoard)    AES, DES, 3DES, SHA, MD5
AM335x (BeagleBone)    AES, SHA, MD5



Cryptography in the Sitara SDK

All Sitara SDKs include OpenSSL.  By default, OpenSSL is a pure software implementation of general cryptographic functions.  The specific version of OpenSSL in each SDK may vary, but all versions are capable of general performance measurements.  For Sitara devices with no HW acceleration, OpenSSL is therefore used to measure software-only crypto performance.  The SDKs for devices with HW acceleration add a Linux driver for the crypto modules and a driver for the Open Cryptographic Framework (OCF).  The OCF driver is an open source general abstraction layer that lets user-level applications (like OpenSSL) access the available HW crypto acceleration modules.  OCF includes a test application that can be used to measure crypto performance directly at the OCF level.



How the numbers are generated


Because OpenSSL is already included in the SDK, it is easy to run its built-in speed test for individual algorithms.  The examples below show the format of the command.  Typing just "openssl speed" runs the speed test on every available algorithm, which can be time consuming and unnecessary.  Specifying the name of an algorithm runs the test for just that algorithm.  Entering an invalid algorithm name causes OpenSSL to list the available algorithms.  In the example below, the speed test is first executed with an invalid algorithm name, which lists the algorithms, and is then executed for aes-256-cbc.  The results of that test are shown below.

root@am335x-evm:~# openssl speed sdkjfh
Error: bad option or value
Available values:
mdc2 md4 md5 hmac sha1 sha256 sha512 whirlpoolrmd160
idea-cbc seed-cbc rc2-cbc bf-cbc
des-cbc des-ede3 aes-128-cbc aes-192-cbc aes-256-cbc aes-128-ige aes-192-ige aes-256-ige
camellia-128-cbc camellia-192-cbc camellia-256-cbc rc4
rsa512 rsa1024 rsa2048 rsa4096
dsa512 dsa1024 dsa2048
ecdsap160 ecdsap192 ecdsap224 ecdsap256 ecdsap384 ecdsap521
ecdsak163 ecdsak233 ecdsak283 ecdsak409 ecdsak571
ecdsab163 ecdsab233 ecdsab283 ecdsab409 ecdsab571
ecdsa
ecdhp160 ecdhp192 ecdhp224 ecdhp256 ecdhp384 ecdhp521
ecdhk163 ecdhk233 ecdhk283 ecdhk409 ecdhk571
ecdhb163 ecdhb233 ecdhb283 ecdhb409 ecdhb571
ecdh
idea seed rc2 des aes camellia rsa blowfish

Available options:
-engine e use engine e, possibly a hardware device.
-evp e use EVP e.
-decrypt time decryption instead of encryption (only EVP).
-mr produce machine readable output.
-multi n run n benchmarks in parallel.
Command exited with non-zero status 1

root@am335x-evm:~# time -v openssl speed aes-256-cbc
Doing aes-256 cbc for 3s on 16 size blocks: 	1399402 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 	 376353 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 256 size blocks: 	  96197 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 	  24205 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 8192 size blocks: 	   3029 aes-256 cbc's in 3.00s
OpenSSL 1.0.0d 8 Feb 2011
built on: Mon Mar 19 09:02:42 CDT 2012
options:bn(64,32) rc4(ptr,int) des(idx,risc1,2,long) aes(partial) idea(int) blowfish(idx)
compiler: arm-arago-linux-gnueabi-gcc -march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=softfp 
-mthumb-interwork -mno-thumb --sysroot=/home/hudson/amsdk-nightly-build-05.04.01.00/cortex-A8/arago-tmp/
sysroots/armv7a-arago-linux-gnueabi -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H 
-DL_ENDIAN -DTERMIO -fexpensive-optimizations -frename-registers -fomit-frame-pointer -O2 -ggdb2 -Wall 
-DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS
The 'numbers' are in 1000s of bytes per second processed.
type 		16 bytes 	64 bytes 	256 bytes 	1024 bytes 	8192 bytes
aes-256 cbc 	7463.48k 	8028.86k 	8208.81k 	8261.97k 	8271.19k
	Command being timed: "openssl speed aes-256-cbc"
	User time (seconds): 15.01
	System time (seconds): 0.02
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 15.05s
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 6608
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 445
	Voluntary context switches: 11
	Involuntary context switches: 315
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
root@am335x-evm:~#
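
The summary row in the output can be reproduced from the "Doing ..." lines: each figure is the operation count multiplied by the block size, divided by the reported seconds, in units of 1000 bytes per second.  The following is a minimal Python sketch of that arithmetic.  The regular expression and the function name throughput_k are illustrative assumptions and are not part of OpenSSL.

```python
import re

# "Doing" lines copied from the software-only run above (whitespace normalized).
LOG = """\
Doing aes-256 cbc for 3s on 16 size blocks: 1399402 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 376353 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 256 size blocks: 96197 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 24205 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 8192 size blocks: 3029 aes-256 cbc's in 3.00s
"""

# Hypothetical pattern: captures block size, operation count, reported seconds.
LINE_RE = re.compile(r"on (\d+) size blocks:\s+(\d+) .* in ([\d.]+)s")


def throughput_k(log):
    """Return {block size: thousands of bytes per second} for each 'Doing' line."""
    result = {}
    for line in log.splitlines():
        match = LINE_RE.search(line)
        if match:
            size = int(match.group(1))
            count = int(match.group(2))
            seconds = float(match.group(3))
            result[size] = count * size / seconds / 1000.0
    return result


if __name__ == "__main__":
    for size, rate in throughput_k(LOG).items():
        print(f"{size:>5} bytes: {rate:.2f}k")
```

Rounded to two decimals, the computed values agree with the row printed by the tool (7463.48k for 16-byte blocks up to 8271.19k for 8192-byte blocks).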


The test results above show the OpenSSL speed test for AES-256 in CBC mode without crypto acceleration.  The command was prefixed with "time -v" to provide additional metrics about how the operation performed with regard to CPU usage.  Note that the speed test without crypto acceleration occupied the CPU at 99%, almost all of it in user mode (15.01 s of user time versus 0.02 s of system time).
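
The CPU figure can be cross-checked from the other fields of the "time -v" report.  The sketch below assumes that the reported percentage is (user time + system time) / elapsed time, truncated to a whole percent; that formula is an assumption about the time utility and has not been verified against the version on the target.  The function name cpu_share is hypothetical.

```python
def cpu_share(user_s, system_s, elapsed_s):
    """Fraction of wall-clock time the job spent on the CPU (user + system)."""
    return (user_s + system_s) / elapsed_s


# Values taken from the "time -v" report of the software-only run.
USER_S, SYSTEM_S, ELAPSED_S = 15.01, 0.02, 15.05

share = cpu_share(USER_S, SYSTEM_S, ELAPSED_S)
user_fraction = USER_S / (USER_S + SYSTEM_S)  # part of the CPU time spent in user mode

if __name__ == "__main__":
    print(f"CPU share of elapsed time: {share:.1%}")
    print(f"User-mode part of CPU time: {user_fraction:.1%}")
```

Truncating the computed share to a whole percent gives 99, consistent with the "Percent of CPU this job got" line.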

Now the test is executed again, this time with additional parameters (-evp and -engine cryptodev) that give OpenSSL access to the crypto accelerators.


root@am335x-evm:~# time -v openssl speed -evp aes-256-cbc -engine cryptodev
engine "cryptodev" set.
Doing aes-256-cbc for 3s on 16 size blocks: 137551 aes-256-cbc's in 0.12s
Doing aes-256-cbc for 3s on 64 size blocks: 102837 aes-256-cbc's in 0.07s
Doing aes-256-cbc for 3s on 256 size blocks: 52428 aes-256-cbc's in 0.06s
Doing aes-256-cbc for 3s on 1024 size blocks: 17712 aes-256-cbc's in 0.04s
Doing aes-256-cbc for 3s on 8192 size blocks: 2460 aes-256-cbc's in 0.01s
OpenSSL 1.0.0d 8 Feb 2011
built on: Mon Mar 19 09:02:42 CDT 2012
options:bn(64,32) rc4(ptr,int) des(idx,risc1,2,long) aes(partial) idea(int) blowfish(idx)
compiler: arm-arago-linux-gnueabi-gcc -march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=softfp -mthumb-interwork -mno-thumb --sysroot=/home/hudson/amsdk-nightly-build-05.04.01.00/cortex-A8/arago-tmp/sysroots/armv7a-arago-linux-gnueabi -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -fexpensive-optimizations -frename-registers -fomit-frame-pointer -O2 -ggdb2 -Wall -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-256-cbc 18340.13k 94022.40k 223692.80k 453427.20k 2015232.00k
Command being timed: "openssl speed -evp aes-256-cbc -engine cryptodev"
User time (seconds): 0.32
System time (seconds): 12.40
Percent of CPU this job got: 84%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 15.05s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 6592
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 444
Voluntary context switches: 11
Involuntary context switches: 313256
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
root@am335x-evm:~#
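
A caveat when reading the accelerated figures: the times in the "Doing ..." lines (0.12 s, 0.07 s, and so on) appear to be user-mode CPU time rather than elapsed time, so the printed rates divide the processed bytes by a very small denominator.  The Python sketch below recomputes the rates under the assumption that each measurement actually ran for the requested 3 s of wall-clock time.  That assumption is not confirmed by the log, although it is consistent with the 15.05 s elapsed time for five block sizes.  The names ACCEL_RUN, rate_k and compare are hypothetical.

```python
# (block size, operation count, seconds reported by openssl) from the cryptodev run above.
ACCEL_RUN = [
    (16, 137551, 0.12),
    (64, 102837, 0.07),
    (256, 52428, 0.06),
    (1024, 17712, 0.04),
    (8192, 2460, 0.01),
]

NOMINAL_SECONDS = 3.0  # assumption: each measurement lasted the requested 3 s


def rate_k(size, count, seconds):
    """Throughput in thousands of bytes per second."""
    return size * count / seconds / 1000.0


def compare(run, wall_seconds=NOMINAL_SECONDS):
    """Return rows of (block size, rate as printed, rate over assumed wall-clock time)."""
    return [
        (size, rate_k(size, count, secs), rate_k(size, count, wall_seconds))
        for size, count, secs in run
    ]


if __name__ == "__main__":
    print("block   as printed   over 3 s wall clock")
    for size, printed, wall in compare(ACCEL_RUN):
        print(f"{size:>5} {printed:>12.2f}k {wall:>12.2f}k")
```

Under that assumption the 8192-byte case works out to roughly 6717k bytes per second of wall-clock time, compared with the 2015232.00k printed by the tool, so the printed accelerated figures should not be compared directly with the software-only row.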




