Lmbench

From Texas Instruments Wiki

About

lmbench is a suite of simple, portable ANSI C microbenchmarks for UNIX/POSIX systems. In general, it measures two key features: latency and bandwidth. lmbench is intended to give system developers insight into the basic costs of key operations. It supports the following:

  • Bandwidth benchmarks
    • Cached file read
    • Memory copy (bcopy)
    • Memory read
    • Memory write
    • Pipe
    • TCP
  • Latency benchmarks
    • Context switching
    • Networking: connection establishment, pipe, TCP, UDP, and RPC hot potato
    • File system creates and deletes
    • Process creation
    • Signal handling
    • System call overhead
    • Memory read latency
  • Miscellaneous
    • Processor clock rate calculation

Visit the lmbench web page for more information.


Source Download Location


Cross compiling

  • Cross compilation command: make CC=$(TOOL_CHAIN_PREFIX)-gcc
    • TOOL_CHAIN_PREFIX is the prefix of the tool chain in use; set it to match your tool chain. Also make sure the path to the tool chain is exported as part of $PATH.


Test setup

  • EVM booted up with NFS configuration.


Execution with logs

1. BANDWIDTH MEASUREMENTS
----------------------------------------

bw_file_rd
--------------
bw_file_rd times the read of the specified file in 64KB blocks. Results are reported in megabytes read per second. 
The data is not accessed in the user program; the benchmark relies on the operating system's read interface to have actually moved the data. 

The size specification may end with "k" or "m" to mean kilobytes (* 1024) or megabytes (* 1024 * 1024). 
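The suffix rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not lmbench's own parser: the function name parse_size is made up, and accepting upper-case suffixes (as in the 7M used in the logs on this page) is an assumption.

```python
def parse_size(spec: str) -> int:
    """Convert a size such as '7M' or '64k' into a byte count."""
    # Multipliers as described in the text: k = 1024, m = 1024 * 1024.
    multipliers = {"k": 1024, "m": 1024 * 1024}
    spec = spec.strip()
    if not spec:
        raise ValueError("empty size specification")
    # Assumption: the suffix is matched case-insensitively.
    suffix = spec[-1].lower()
    if suffix in multipliers:
        return int(spec[:-1]) * multipliers[suffix]
    # No suffix: the value is taken as a plain byte count.
    return int(spec)


if __name__ == "__main__":
    print(parse_size("7M"))   # 7 * 1024 * 1024 bytes
    print(parse_size("64k"))  # 64 * 1024 bytes
```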

./bw_file_rd 7M open2close ../new.ppt
7.00 154.21

The above command benchmarks file read performance. 7M is the amount of data to read, and open2close means that the measurement includes the time taken to open and close the file as well. 
To time only the reads, pass io_only instead:

[root@beagleboard arm-none-linux-gnueabi]# ./bw_file_rd 7M io_only ../new.ppt
7.00 155.41

The first column is the amount of data read in megabytes and the second column is the bandwidth in megabytes per second.

bw_mem
--------------
[root@beagleboard arm-none-linux-gnueabi]# ./bw_mem 1M rd
1.00 244.56
bw_mem rd allocates the specified amount of memory, zeros it, and then times the reading of that memory.

[root@beagleboard arm-none-linux-gnueabi]# ./bw_mem 1M wr
1.00 432.77
bw_mem wr allocates the specified amount of memory, zeros it, and then times the writing of that memory as a series of 4 byte integer stores and increments. 

[root@beagleboard arm-none-linux-gnueabi]# ./bw_mem 1M rdwr
1.00 208.38
[root@beagleboard arm-none-linux-gnueabi]# ./bw_mem 1M cp
1.00 205.25
[root@beagleboard arm-none-linux-gnueabi]# ./bw_mem 1M fwr
1.00 433.46
[root@beagleboard arm-none-linux-gnueabi]# ./bw_mem 1M frd
1.00 235.77
[root@beagleboard arm-none-linux-gnueabi]# ./bw_mem 1M fcp
1.00 192.34
[root@beagleboard arm-none-linux-gnueabi]# ./bw_mem 1M bzero
1.00 430.71
[root@beagleboard arm-none-linux-gnueabi]# ./bw_mem 1M bcopy
1.00 189.02

bw_mmap_rd
------------------
bw_mmap_rd creates a memory mapping to the file and then reads the mapping.
[root@beagleboard arm-none-linux-gnueabi]# ./bw_mmap_rd 1M open2close ../new.ppt
1.00 185.85
[root@beagleboard arm-none-linux-gnueabi]# ./bw_mmap_rd 1M mmap_only ../new.ppt
1.00 237.99

bw_pipe
-------------------
bw_pipe creates a Unix pipe between two processes and moves 50MB through the pipe in 64KB chunks.
[root@beagleboard arm-none-linux-gnueabi]# ./bw_pipe
Pipe bandwidth: 119.08 MB/sec

bw_tcp
------------
bw_tcp is a client/server program that moves data over a TCP/IP socket. Nothing is done with the data on either side; the data is moved in 48KB chunks. 

[root@beagleboard arm-none-linux-gnueabi]# ./bw_tcp localhost
0.065536 81.71 MB/sec

bw_unix
-----------------
bw_unix measures the bandwidth of data streamed over an AF_UNIX socket.
[root@beagleboard arm-none-linux-gnueabi]# ./bw_unix
AF_UNIX sock stream bandwidth: 120.00 MB/sec

2. LATENCY MEASUREMENTS
-----------------------------------------

lat_cmd
------------
lat_cmd measures the latency of running a command.
[root@beagleboard arm-none-linux-gnueabi]# ./lat_cmd ls
lat_cmd: 599.5294 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_cmd ps
lat_cmd: 2624.3333 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_cmd cp ../usbtree.txt ../new.txt
lat_cmd: 4852.0000 microseconds

lat_connect
-----------------
Measures interprocess connection latencies. The benchmark times the creation and connection of an AF_INET (aka TCP/IP) socket to a server.
[root@beagleboard arm-none-linux-gnueabi]# ./lat_connect localhost
TCP/IP connection cost to localhost: 92.2975 microseconds

lat_ctx
----------------------
Measures context switching time for any reasonable number of processes of any reasonable size.

The output is multi-line: the first line is a title that specifies the size and the non-context-switching overhead of the test. Each subsequent line is a pair of numbers that gives the number of processes and the cost of a context switch.
[root@beagleboard arm-none-linux-gnueabi]# ./lat_ctx -s 128K processes 2

"size=128k ovr=191.70
2 118.50
[root@beagleboard arm-none-linux-gnueabi]# ./lat_ctx -s 128K processes 4

"size=128k ovr=234.81
4 400.60
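The output format described above is easy to read back in a script. The following is a minimal sketch, assuming the title line always looks like the ones in the logs on this page (a leading double quote, then size=...k and ovr=...); the names CtxResult and parse_lat_ctx are made up for illustration, and the numbers are kept in whatever unit lat_ctx printed them.

```python
import re
from typing import NamedTuple


class CtxResult(NamedTuple):
    size_kb: int     # process size from the title line
    overhead: float  # non-context-switching overhead from the title line
    switches: list   # (process count, context switch cost) pairs


def parse_lat_ctx(text: str) -> CtxResult:
    """Parse one lat_ctx run: a title line followed by number pairs."""
    size_kb = None
    overhead = None
    switches = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # the logs start with an empty line
        if line.startswith('"'):
            # Title line, e.g. "size=128k ovr=191.70
            match = re.match(r'"size=(\d+)k ovr=([\d.]+)', line)
            if match is None:
                raise ValueError(f"unexpected title line: {line!r}")
            size_kb = int(match.group(1))
            overhead = float(match.group(2))
        else:
            procs, cost = line.split()
            switches.append((int(procs), float(cost)))
    if size_kb is None:
        raise ValueError("no title line found")
    return CtxResult(size_kb, overhead, switches)


if __name__ == "__main__":
    sample = '\n"size=128k ovr=191.70\n2 118.50\n'
    print(parse_lat_ctx(sample))
```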


lat_dram_page
---------------------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_dram_page -M 1M
60.517793

lat_fcntl
-------------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_fcntl
Fcntl lock latency: 6.7451 microseconds

lat_fs
-------------
lat_fs is a program that creates a number of small files in the current working directory and then removes the files. Both the creation and the removal of the files are timed. 

[root@beagleboard arm-none-linux-gnueabi]# ./lat_fs
0k      67      4968    16342
1k      50      4016    9992
4k      42      3792    10021
10k     30      2770    7453


Each line lists the file size, the number of files used, and the resulting creates per second and deletes per second.

lat_mem_rd
----------------------

lat_mem_rd measures memory read latency for varying memory sizes and strides. The results are reported in nanoseconds per load.

[root@beagleboard arm-none-linux-gnueabi]# ./lat_mem_rd 1M
"stride=128
0.00049 6.276
0.00098 6.597
0.00195 6.194
0.00293 7.013
0.00391 6.202
0.00586 6.280
0.00781 6.314
0.01172 6.192
0.01562 6.313
0.02344 38.590
0.03125 46.596
0.04688 52.612
0.06250 54.662
0.09375 57.368
0.12500 63.414
0.18750 78.652
0.25000 104.024
0.37500 178.904
0.50000 226.348
0.75000 252.026
1.00000 259.944
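One common way to read such a table is to look for the sizes at which the latency rises sharply, since these tend to line up with cache capacities. The sketch below is illustrative only: it assumes the first column is the array size in megabytes, uses an arbitrary threshold of 2x between neighbouring rows, and the function names are made up. The sample data is an excerpt of the log above.

```python
def parse_lat_mem_rd(text):
    """Return (size, latency_ns) pairs, skipping the quoted title line."""
    points = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('"'):
            continue
        size_mb, latency_ns = line.split()
        points.append((float(size_mb), float(latency_ns)))
    return points


def find_latency_jumps(points, ratio=2.0):
    """Return (size_before, size_after) pairs where latency grows by more than `ratio`."""
    jumps = []
    for (prev_size, prev_lat), (size, lat) in zip(points, points[1:]):
        if lat > ratio * prev_lat:
            jumps.append((prev_size, size))
    return jumps


# Excerpt of the log above.
SAMPLE = """\
"stride=128
0.00781 6.314
0.01172 6.192
0.01562 6.313
0.02344 38.590
0.03125 46.596
0.04688 52.612
"""

if __name__ == "__main__":
    for below, above in find_latency_jumps(parse_lat_mem_rd(SAMPLE)):
        # Assumption: sizes are in megabytes, so multiply by 1024 for KB.
        print(f"latency jumps between {below * 1024:.0f} KB and {above * 1024:.0f} KB")
```

A fixed ratio is a crude detector; for the slower, more gradual rise seen between 0.125 and 0.5 in the full log, a lower threshold or a plot would be more informative.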

lat_mmap
---------------
lat_mmap times how fast a mapping can be made and unmade.
[root@beagleboard arm-none-linux-gnueabi]# ./lat_mmap 1M ../new.ppt
1.000000 99

The result columns are the mapping size in megabytes and the time in microseconds.

lat_ops
---------------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_ops
integer bit: 2.06 nanoseconds
integer add: 3.19 nanoseconds
integer mul: 1.25 nanoseconds
integer div: 119.11 nanoseconds
integer mod: 45.14 nanoseconds
int64 bit: 2.11 nanoseconds
uint64 add: 2.59 nanoseconds
int64 mul: 2.69 nanoseconds
int64 div: 543.74 nanoseconds
int64 mod: 403.95 nanoseconds
float add: 41.92 nanoseconds
float mul: 33.08 nanoseconds
float div: 172.20 nanoseconds
double add: 62.99 nanoseconds
double mul: 51.67 nanoseconds
double div: 931.12 nanoseconds
float bogomflops: 358.26 nanoseconds
double bogomflops: 1271.33 nanoseconds
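These latencies are easier to compare across boards when expressed in clock cycles. A minimal sketch of the conversion, assuming the 491 MHz figure that ./mhz reports further down this page applies to the same run; the function name ns_to_cycles is made up for illustration.

```python
def ns_to_cycles(latency_ns: float, clock_mhz: float) -> float:
    """Convert a latency in nanoseconds into CPU clock cycles."""
    # cycles = seconds * Hz = (ns * 1e-9) * (MHz * 1e6) = ns * MHz / 1000
    return latency_ns * clock_mhz / 1000.0


if __name__ == "__main__":
    CLOCK_MHZ = 491.0  # assumption: taken from the ./mhz log on this page
    samples = [
        ("integer add", 3.19),
        ("integer div", 119.11),
        ("double div", 931.12),
    ]
    for name, ns in samples:
        print(f"{name}: {ns_to_cycles(ns, CLOCK_MHZ):.1f} cycles")
```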

lat_pipe
-------------------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_pipe
Pipe latency: 30.9000 microseconds

lat_pagefault
---------------------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_pagefault ../new.ppt
Pagefaults on ../new.ppt: 4.3572 microseconds

lat_proc
---------------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_proc fork
Process fork+exit: 958.5833 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_proc exec
Process fork+execve: 1149.6545 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_proc shell
Process fork+/bin/sh -c: 9566.5000 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_proc procedure
Procedure call: 0.0349 microseconds

lat_rand
--------------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_rand
drand48 latency: 397.43 nanoseconds
lrand48 latency: 156.03 nanoseconds

lat_select
---------------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_select tcp
Select on 200 tcp fd's: 74.4737 microseconds

lat_sem
-------------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_sem
Semaphore latency: 7.5261 microseconds

lat_sig
--------------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_sig  install
Signal handler installation: 1.7298 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_sig catch
Signal handler overhead: 5.0362 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_sig prot
Usage: ./lat_sig [-P <parallelism>] [-W <warmup>] [-N <repetitions>] install|catch|prot [file]
[root@beagleboard arm-none-linux-gnueabi]# ./lat_sig prot ../new.ppt
Protection fault: 1.0195 microseconds

lat_syscall
-------------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_syscall fstat ../new.ppt
Simple fstat: 1.8831 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_syscall open ../new.ppt
Simple open/close: 9.2837 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_syscall stat ../new.ppt
Simple stat: 5.4106 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_syscall write ../new.ppt
Simple write: 0.9867 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_syscall read ../new.ppt
Simple read: 1.1014 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_syscall null ../new.ppt

[root@beagleboard arm-none-linux-gnueabi]# ./lat_syscall null
Simple syscall: 0.5209 microseconds

lat_tcp
-----------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_tcp localhost
TCP latency using localhost: 1.8949 microseconds

lat_udp
--------------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_udp localhost
UDP latency using localhost: 67.1474 microseconds

lat_unix
---------------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_unix
AF_UNIX sock stream latency: 47.2522 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_unix_connect
connect: No such file or directory

lat_usleep
--------------
[root@beagleboard arm-none-linux-gnueabi]# ./lat_usleep -u usleep 100
usleep 100 microseconds: 585.6283 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_usleep -u nanosleep 100
nanosleep 100 microseconds: 575.0153 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_usleep -u select 100
select 100 microseconds: 7781.5000 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_usleep -u pselect 100
Usage: ./lat_usleep [-r] [-u <method>] [-P <parallelism>] [-W <warmup>] [-N <repetitions>] usecs
method=usleep|nanosleep|select|pselect|itimer
[root@beagleboard arm-none-linux-gnueabi]# ./lat_usleep -u itimer 100
itimer 100 microseconds: 1558.2442 microseconds
[root@beagleboard arm-none-linux-gnueabi]# ./lat_usleep -u pselect
Usage: ./lat_usleep [-r] [-u <method>] [-P <parallelism>] [-W <warmup>] [-N <repetitions>] usecs
method=usleep|nanosleep|select|pselect|itimer
[root@beagleboard arm-none-linux-gnueabi]# ./lat_usleep pselect
usleep 0 microseconds: 4.9160 microseconds
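Each successful run above prints the method, the requested interval and the measured interval, so the overshoot of each sleep method can be computed directly. The following is a rough sketch that assumes the line format seen in these logs; the name sleep_overshoot is made up, and lines that do not match (such as the usage text) are simply skipped.

```python
import re

# Matches lines such as: usleep 100 microseconds: 585.6283 microseconds
RESULT_LINE = re.compile(r"^(\w+) (\d+) microseconds: ([\d.]+) microseconds$")


def sleep_overshoot(line: str):
    """Return (method, measured - requested) in microseconds, or None."""
    match = RESULT_LINE.match(line.strip())
    if match is None:
        return None  # prompt lines, usage text, and so on
    method = match.group(1)
    requested = int(match.group(2))
    measured = float(match.group(3))
    return method, measured - requested


if __name__ == "__main__":
    log = [
        "usleep 100 microseconds: 585.6283 microseconds",
        "nanosleep 100 microseconds: 575.0153 microseconds",
        "select 100 microseconds: 7781.5000 microseconds",
        "itimer 100 microseconds: 1558.2442 microseconds",
    ]
    for line in log:
        method, extra = sleep_overshoot(line)
        print(f"{method}: {extra:.1f} microseconds longer than requested")
```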


3. OTHER MEASUREMENTS
-------------------------------------

cache
---------------
cache first attempts to determine the number and size of caches by measuring the memory latency for various memory sizes. 

[root@beagleboard arm-none-linux-gnueabi]# ./cache -M 128K
Memory latency: 6.16 nanoseconds 2.99 parallelism

line
----------
[root@beagleboard arm-none-linux-gnueabi]# ./line
64

lmdd
------------
[root@beagleboard arm-none-linux-gnueabi]# ./lmdd if=internal of=/tmp/file count=1000 fsync=1
8.1920 MB in 3.4707 secs, 2.3603 MB/sec
[root@beagleboard arm-none-linux-gnueabi]# ./lmdd if=/tmp/usbtree.txt of=/tmp/file count=1000 fsync=1
0.0047 MB in 0.1649 secs, 0.0284 MB/sec

memsize
------------
[root@beagleboard arm-none-linux-gnueabi]# ./memsize
64MB OK
64

mhz
------------
[root@beagleboard arm-none-linux-gnueabi]# ./mhz
491 MHz, 2.0367 nanosec clock

msleep
------------
[root@beagleboard arm-none-linux-gnueabi]# ./msleep
Segmentation fault
[root@beagleboard arm-none-linux-gnueabi]# ./msleep --help
[root@beagleboard arm-none-linux-gnueabi]# ./msleep 100
[root@beagleboard arm-none-linux-gnueabi]# ./msleep 1000
[root@beagleboard arm-none-linux-gnueabi]# ./msleep 10000
[root@beagleboard arm-none-linux-gnueabi]# ./msleep 5000
[root@beagleboard arm-none-linux-gnueabi]# ./msleep 3000


par_mem
------------
par_mem measures the available parallelism in the memory hierarchy, up to len bytes.
[root@beagleboard arm-none-linux-gnueabi]# ./par_mem -M 1M
0.004096 3.06
0.008192 3.00
0.016384 3.44
0.032768 1.25
0.065536 1.08
0.131072 1.05
0.262144 1.14
0.524288 1.03

par_ops
------------
[root@beagleboard arm-none-linux-gnueabi]# ./par_ops
integer bit parallelism: 1.51
integer add parallelism: 1.91
integer mul parallelism: 2.10
integer div parallelism: 1.25
integer mod parallelism: 1.03
int64 bit parallelism: 1.00
int64 add parallelism: 1.08
int64 mul parallelism: 1.00
int64 div parallelism: 1.00
int64 mod parallelism: 1.00
float add parallelism: 1.00
float mul parallelism: 1.04
float div parallelism: 1.00
double add parallelism: 1.00
double mul parallelism: 1.00
double div parallelism: 1.00


stream
------------
[root@beagleboard arm-none-linux-gnueabi]# ./stream -M 128K
STREAM copy latency: 18.79 nanoseconds
STREAM copy bandwidth: 851.53 MB/sec
STREAM scale latency: 121.15 nanoseconds
STREAM scale bandwidth: 132.06 MB/sec
STREAM add latency: 126.25 nanoseconds
STREAM add bandwidth: 190.11 MB/sec
STREAM triad latency: 271.05 nanoseconds
STREAM triad bandwidth: 88.55 MB/sec

tlb
------------
[root@beagleboard arm-none-linux-gnueabi]# ./tlb -M 1M
tlb: 31 pages

disk
-----------
./disk ../new.ppt
1.0 1.01
1.0 1.01
1.0 1.01
0.9 1.01
0.9 1.01
0.9 1.01
0.9 1.01
0.9 1.01
0.9 1.01
0.8 1.01
0.8 1.01
0.8 1.01
0.8 1.22
0.7 1.01
0.7 1.01
0.7 1.01
0.7 1.01
0.7 1.01
0.6 1.01
0.6 1.01
0.6 1.01
0.6 1.01
0.6 1.01
0.6 1.01
0.6 1.01
0.5 1.01
0.5 1.01
0.5 1.01
0.5 1.01
0.5 1.01
0.4 1.01
0.4 1.01
0.4 1.01
0.4 1.16
0.4 1.01
0.4 1.01
0.4 1.01
0.4 1.01
0.3 1.01
0.3 1.01
0.3 1.01
0.3 1.01
0.3 1.01
0.3 1.01
0.2 1.01
0.2 1.01
0.2 1.22
0.2 1.01
0.2 1.01
0.1 1.01
0.1 1.01
0.1 1.01
0.1 1.01
0.1 1.01
0.0 1.01
0.0 1.01
0.0 1.13

"Zone bandwidth for ../new.ppt
0.3 30.14
0.8 25.57
1.3 22.72
1.8 21.42
2.4 24.68
2.9 25.41
3.4 23.15
3.9 21.42
4.5 24.97
5.0 22.72
5.5 33.95
6.0 25.57
6.6 23.60
7.1 22.20
7.6 34.50
8.1 25.41

enough
-------------
[root@beagleboard arm-none-linux-gnueabi]# ./enough
10000

Download the lmbench script: File:Lmbench script.zip