TI SATA FAQ

Texas Instruments Serial ATA FAQ
Mansoor Ahamed
IMPORTANT

DM816x refers to DM816x/AM389x devices unless specified.
DM814x refers to DM814x/AM387x devices unless specified.
DM81xx refers to DM816x, DM814x, and DM813x.
AM18x refers to AM18x/OMAP-L138 devices unless specified.
SATA Controller, Host, and HBA refer to the SATA controller in the TI device


SATA Background

What is SATA

SATA stands for Serial ATA. It is a bus interface for connecting host bus adapters to mass storage devices such as hard disk drives and optical drives. SATA was designed to replace the parallel ATA bus.

ATA vs SATA

ATA stands for Advanced Technology Attachment, a 16-bit parallel bus introduced in 1986. It has undergone many revisions over the last 25 years; the maximum speed supported is only 133 MB/s.

What is PATA

All of the synonyms below refer to a modern-day PATA drive:

  PATA – Parallel Advanced Technology Attachment
  UDMA – Ultra Direct Memory Access
   IDE – Integrated Drive Electronics
  EIDE – Enhanced IDE

Features

  • 40 & 80 wire cable option
    • 40 wire limited to UDMA 33 MB/s and below
    • 80 wire allowed for UDMA 66, 100, 133 MB/s
  • Should be 5v tolerant (3.3v has been the norm for several years)
  • Must support Master/Slave/Cable Select

Limitations of PATA

  • Signal timing
  • EMI (electromagnetic interference)
  • Data integrity issues

Parallel vs Serial

Figure: Parallel vs serial data transfer

PATA vs SATA

Refer to Figure 2 (Parallel ATA device connectivity) and Figure 3 (Serial ATA connectivity) in http://arc.opensolaris.org/caselog/FWARC/2008/013/commitment2.materials/specifications/SerialATA_Revision_2_6_Gold.pdf

What is ATAPI

AT Attachment Packet Interface (ATAPI) provides additional commands on top of ATA to control CD/DVD drives. If we support CD/DVD drives then we can claim ATAPI support, and vice versa.

SATA Advantages

  • Four signal wires (two differential pairs) replace the PATA ribbon cable
  • Star topology (point to point) - each device gets full bandwidth
  • Reduces cable & connector costs
  • Higher data rates (150 MB/sec, 300 MB/sec, 600 MB/sec)
    • SATA-I (150 MB/sec)
    • SATA-II (300 MB/sec)
  • Hot pluggable

What is NCQ

  • Ability to issue multiple commands to the drive and allow the device to reorder them.
  • Only the device knows the disk organization and angular position, hence it has the freedom to optimally reorder commands.
  • Can reorder up to 32 outstanding commands.
  • Wikipedia has a very good article on NCQ.


Cables & Connectors

Figure: Cable / connector connectivity diagram

Hard Disk Side Connector Details

Figure: Hard disk drive connector details

How important a role does the cable play in SATA validation

Low-grade SATA/eSATA cables do not have proper shielding; this leads to data loss, and sometimes the host will not even enumerate the connected device. The link speed may also drop to 1.5 Gbps.

Do you have any recommendations for SATA cables

Cables from Foxconn and Molex are good. Our software validation team uses Foxconn and a few VegaTech cables.

eSATA

TI Host supports SATA or eSATA?

SATA is mainly used for internal storage and eSATA for external storage. eSATA has different electrical requirements and can support cable lengths of up to 2 meters. The SATA PHY used in TI devices can support both SATA and eSATA, and the PHY configuration values used in DM81XX software releases support both. However, TI does not officially claim eSATA support; please contact the TI Sales Team for more information. For more information on eSATA please refer [[1]]

Why does the TI EVM not have an eSATA connector

Even though TI devices can support eSATA, TI does not claim eSATA support.

Our intention was primarily to validate SATA devices.

Also, the cable length requirement for our EVM was less than 2 feet, so the device would mostly be used as internal storage. Very few storage devices in the market have an eSATA connector, and the devices/cables we have for SATA validation came with SATA connectors (not eSATA). Refer [[2]] to view the difference between SATA and eSATA connectors.

If the EVM does not have an eSATA connector, how do I connect an eSATA device to it

There are cables available in the market with a regular SATA connector at one end and an eSATA connector at the other. These cables can be used to connect the device to our EVM.

Does the Linux driver require any change specific to eSATA support

Linux does not require any change for eSATA support. The only grey area is hotplug; even Intel x86 based motherboards which claim eSATA support have had hotplug issues. We need to validate hotplug support with different eSATA devices before claiming eSATA support in Linux.

Have we validated our EVM with a device which has an eSATA interface

Yes, we have validated the DM8168 EVM with devices which have an eSATA interface. Some of the devices used are the Addonics Port Multiplier (PMP) and the Addonics RAID storage tower. More information on PMP and RAID is available below (in this FAQ).

Port Extension (Port Multiplier - a.k.a PMP)

What is a Port Multiplier (PMP)

A device used to connect multiple SATA devices to a single SATA host. Its functionality is similar to a USB hub (but not exactly the same). For more information refer [[3]]

What are the different ways of achieving Port Multiplication

Port Multiplication can be achieved through Command Based Switching (CBS) or FIS Based Switching (FBS). Please refer [[4]] for the differences. A SATA host can support both CBS and FBS based Port Multiplication.

Command Based Switching (CBS) - The host will not issue commands to another drive until the command queue for the current drive is complete

FIS Based Switching (FBS) - The host can issue and complete commands to any drive in any order

Please refer to the animation in the [Port Multiplier article in 'www.serialata.org'] for more information on the differences between CBS and FBS.

Which is advantageous, CBS or FBS Port Multiplication

The host can achieve better performance with FBS. This is because Native Command Queuing (NCQ [[5]]) cannot be used effectively in CBS: any outstanding transaction with a device has to complete before a command can be issued to another device connected to the PMP.

Does TI host support FIS Based Switching Port Multiplication

No, the TI SATA host supports only Command Based Switching PMP.

Note: The SATA hosts on AM1808 and DM81xx are AHCI 1.1 compliant. AHCI 1.1 does not support FIS based switching; to support FIS based switching the host would have to be compliant to AHCI 1.2 or greater.

What are the repercussions of not supporting FBS in our SATA host

If a slow-speed device (like a CD writer) is connected to the port multiplier, then other high-speed devices connected to the same port multiplier will not be able to achieve their original sustained transfer rate. For example, consider a scenario where a CD writer and a 3 Gbps SATA HDD are connected to a PMP. The sustainable write rate of the HDD without the PMP will be around ~70 MBytes/sec (EXT2 file-system). If a CD write operation runs in parallel with the HDD write, the HDD performance could come down by a minimum of ~20%. If the VFAT file-system is used in the same scenario, performance might drop by ~75%.

Even if two HDDs are connected to a PMP (without a CD writer), the performance will drop because of CBS.

What is the recommendation for using PMP with TI SATA host

If a customer is planning to use a PMP then it is recommended not to connect a slow-speed device like a CD writer to the PMP, and instead connect it to the other port available on the EVM (applicable only to DM816X). If the customer has a mandatory requirement of connecting a slow-speed device to the PMP, then it is recommended to use a Linux native file-system like EXT2 (and not VFAT) on the HDDs connected to the PMP. Connecting the slow-speed device via the PCIe interface is another solution.

How many devices can be connected to a PMP

A maximum of 15 devices can be connected to a PMP (the 4-bit PM port field allows 16 addresses, one of which is reserved for the PMP itself), but most of the devices in the market support 2 to 5 ports.

Internet SCSI (iSCSI)

What is iSCSI

iSCSI is an IP-based storage networking standard. Wikipedia has a good article on iSCSI [6]; please read it before continuing.

Does TI SATA controller support iSCSI

iSCSI is a standard and has nothing to do with the SATA controller hardware. iSCSI initiators usually do not require a SATA controller. iSCSI targets require a storage device with SCSI command support; hence an iSCSI target might have a SATA controller for attaching storage devices, and the TI SATA controller would fit there.

iSCSI software support

Linux has initiator, target and multipath iSCSI support, but we have not validated any of them on TI devices. Please check Wikipedia [7] for other software support.

TI SATA Host

What are the TI SOCs which have SATA support

AM18x, DM816x, DM814x

TI SATA Controller Features

Table: TI SATA Controller Features

  Feature                       DM816X                              DM814X                              AM18X
  Gen                           2                                   2                                   2
  Speed                         min 1.5 Gbps, max 3 Gbps            min 1.5 Gbps, max 3 Gbps            min 1.5 Gbps, max 3 Gbps
  Integrated PHY                YES                                 YES                                 YES
  Integrated Rx & Tx buffers    YES                                 YES                                 YES
  Power Management Features     YES                                 YES                                 YES
  DMA Engine                    Internal (dedicated)                Internal (dedicated)                Internal (dedicated)
  Hardware assisted NCQ         YES, 32 entries                     YES, 32 entries                     YES, 32 entries
  Port Multiplier               YES, only Command Based Switching   YES, only Command Based Switching   YES, only Command Based Switching
  Interface/Functional Clock    250 MHz                             200 MHz                             150 MHz
  PHY Clock                     100 MHz                             100 MHz                             100 MHz
  Activity LED support          YES                                 YES                                 YES
  Cold Presence detect          NO                                  NO                                  YES
  Mechanical Presence Switch    NO                                  NO                                  YES
  Number of AHCI Ports          2                                   1                                   1
  RXFIFO Depth                  128 DWords                          128 DWords                          64 DWords
  TXFIFO Depth                  64 DWords                           64 DWords                           32 DWords
  64-bit Addressing             NO                                  NO                                  NO
  BIST Loopback                 FIS                                 FIS                                 FIS
  eSATA                         NO*                                 NO*                                 NO*
  Gen 3 device support          NO*                                 NO*                                 NO*


Note: (*) Contact the TI Sales Team for specifics.

What is the maximum link speed supported on DM81XX / AM18X

The SATA IP used in DM81XX / AM18X is Gen 2 and hence supports 3 Gbps.

Can our host work with devices which support only 1.5Gbps

Yes, all Gen 2 hosts are backward compatible with Gen 1 devices, hence our host will work with 1.5 Gbps devices.

Will it work with devices which can support 6Gbps

If the device in question is backward compatible with Gen 2, then it should work at 3 Gbps with our SATA host.

Note: We have observed interoperability issues with some Gen 3 devices. Hence we cannot guarantee compliance without testing the device in question.
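
To check the negotiated link speed at runtime, the kernel log can be inspected; a minimal sketch, assuming the usual libata log format:

  dmesg | grep "SATA link up"  ==> a line like "ata1: SATA link up 3.0 Gbps" shows the negotiated speed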

Why do different SOCs have different PHYs

PHY selection and re-use depend on the power and process-technology requirements of each SOC.

How are the PHY configuration values used in the software calculated

These values are usually decided based on discussions with the silicon validation and IP teams. They are unlikely to change drastically. Please refer to the SATA programming guide of the respective TI SOC for a description of the PHY configuration registers.


In DM816X, do the two ports work independently

A common question on SOCs with multiple SATA ports is: do the ports work independently? E.g. if one port dies, does the other stay alive?

Some facts:

  • As per AHCI, "a single HBA can support up to 32 ports; an SOC can also have multiple HBAs with 1 port per HBA". DM816X fits the former: it has one HBA and 2 ports.
  • In DM816x, the two ports have independent DMA. Hence they should be treated as independent for performance measurement.
  • But if the HBA is put in reset then both ports will be in reset.
  • A link failure on one port will not affect the other.

Port Selector (PS)

Do we support Port Selector (PS)

A Port Selector is a 2-to-1 SATA analog multiplexer for host controller failover applications. Our device should work with one, but this feature has not been validated. [This section will be updated later with more information.]

RAID

Do we support hardware RAID

No, the TI host controller does not support hardware RAID. But RAID can be achieved either through Linux software RAID or through external hardware RAID.

Do we support software RAID

Linux as such supports software RAID (1/5/10) but we have not validated this feature. Please refer [[8]] and [[9]] for more info on software RAID.

Do we support external hardware RAID

External hardware RAID is a black box for us. Usually external RAID storage tower vendors provide a RAID configuration tool; these are x86 Windows or x86 Linux host based software tools. Either we have to port these tools to ARM Linux (DM81XX) or we have to use a pre-configured external RAID storage. During our testing we configured RAID devices from a Windows XP host for RAID-0 & JBOD and used them with our platform.

Devices Supported

Name some devices that could be connected to a SATA interface

The SATA interface can be used to connect the following devices. Please note that these devices should have a SATA/eSATA connector at their end.

  • HDD
  • CD/DVD drives
  • S/W RAID
  • External RAID storage towers
  • Solid State Devices
  • Port Multiplier
  • Port Selector
  • Blu-Ray

From the above list what are the devices validated with our EVMs

  • HDD
  • CD/DVD drives. Both laptop and external form-factors
  • External RAID storage towers [only as a device, no configuration support; the vendor-provided tool should be used on an x86 Linux/Windows machine to configure RAID]
  • Port Multiplier

Is there any other means of connecting a SATA device to our EVM

Even if the SOC/EVM does not have SATA support, SATA devices can be connected through PCIe or USB.

Note: This is just a recommendation; we have not validated anything related to this.

How do we provide power to SATA devices

All SATA devices have a separate power supply with a separate connector; they work on 12v, 5v and 3.3v. Some EVMs (like DM8168) come with a power source for SATA devices. A regular PC SMPS can also be used to power SATA devices like HDDs, CD/DVD drives and custom PMP PCBs. The power connector for laptop CD/DVD drives is different, and hence you will require a converter cable. Refer [[10]] for different power cables. Some port multipliers and external storage towers (including RAID) come with their own power adapter.

Can we connect Blu-Ray to SATA interface

Yes, if the Blu-Ray device has a SATA interface then we can connect it to our host. This has not been validated.

How do I make a HDD work at 1.5Gbps

This will be required when you suspect that a device is not working at 3 Gbps and Linux is reverting to 1.5 Gbps. You can force 1.5 Gbps using the jumpers on the HDD; please refer to the HDD datasheet for the right selection. Note that not all HDDs support this feature.
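
Alternatively, the link speed can be capped from the host side. A hedged sketch using the libata.force kernel boot parameter (present in the 2.6.3x kernels used here; pass it via the u-boot bootargs variable, and treat the port ID as an assumption to be verified against your boot log):

  libata.force=1.5Gbps    ==> limits all ports to 1.5 Gbps
  libata.force=1:1.5Gbps  ==> limits only port ata1 to 1.5 Gbps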

OS Support

Do we have non-OS software for SATA

The AVV team uses software to validate the basic functionality of the SATA interface, but this software is usually not shared externally. Please contact the TI sales team for more information.

Do we support BIOS

Yes, but it is available only for AM18X devices. Please contact the BIOS PSP Lead for any query related to AM18X BIOS SATA support.

Do we support Linux

Yes, Linux support is available for all devices (AM18X, DM816X, DM814X)

Linux Support

What is the status of Linux SATA support

Linux supports SATA as part of the AHCI framework. Till 2.6.35, AHCI was supported only through PCI, and hence all older releases from TI used custom patches for platform AHCI.

What is the status of Platform AHCI

From 2.6.36, Linux has platform AHCI support. All TI releases with kernel version >= 2.6.37 have platform AHCI support.

What are the major bugs in mainline

As of 19-Jan-2011 (2.6.38-rc1),

  1. If PMP is enabled in the kernel configuration, the host will not enumerate a device connected directly to it; a PMP has to be used to connect a device. This issue is resolved, but the fix is a custom patch available only in TI releases (not in mainline).

How to disable/enable NCQ for SATA device in Linux

  For disabling NCQ ==> echo 1 > /sys/block/sda/device/queue_depth
  For enabling  NCQ ==> echo 31 > /sys/block/sda/device/queue_depth

NCQ can be enabled/disabled per device. In the above example, <sda> is the device for which we are disabling NCQ.
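
To verify the current setting, the same sysfs entry can be read back:

  cat /sys/block/sda/device/queue_depth  ==> prints 1 when NCQ is disabled, up to 31 when enabled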

What are the files related to SATA in Linux kernel source

Note: Kernel version >= 2.6.37.

DM81xx Files

  • arch/arm/mach-omap2/devices.c: AHCI platform device registration part with PHY initialization for DM81xx devices
  • arch/arm/mach-omap2/clock816x_data.c: Clock data for DM816x device
  • arch/arm/mach-omap2/clock814x_data.c: Clock data for DM814x device

AM18x Files

  • arch/arm/mach-davinci/devices-da8xx.c: AHCI platform device registration part with PHY initialization for AM18x device
  • arch/arm/mach-davinci/da850.c: Clock data for AM18x device
  • arch/arm/mach-davinci/board-da850-evm.c : Register AHCI platform device

Common files: we generally do not modify these files

  • drivers/ata: This folder has all ATA, AHCI and SATA related files.
  • drivers/scsi: SCSI Layer

What are the kernel configuration options related to SATA support

NOTE

  • This is based on kernel version 2.6.37.
  • This section captures only the mandatory components used during our validation. Other components might be required for a specific usecase.
  • By default these components will be built into the kernel. To build them as loadable modules, use option [M] during kernel configuration (make menuconfig).

ATA

This menu could be found during menuconfig in the following location

  Device Drivers  --->
    <*> Serial ATA and Parallel ATA drivers  --->
Note: If built as modules, the module binaries (.ko) will be available in the drivers/ata folder.
  • CONFIG_ATA : Serial ATA and Parallel ATA support
  • CONFIG_ATA_VERBOSE_ERROR : Verbose ATA error reporting
  • CONFIG_SATA_PMP : This is for Port Multiplier support
  • CONFIG_SATA_AHCI : PCI / PCIe based AHCI
  • CONFIG_SATA_AHCI_PLATFORM : Platform AHCI support (example, DM81xx and AM1808)
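
To quickly confirm how these options are set in a configured kernel tree, the .config file can be inspected; a minimal sketch run from the kernel source directory:

  grep -E "CONFIG_ATA=|CONFIG_SATA_PMP=|CONFIG_SATA_AHCI" .config  ==> "=y" means built-in, "=m" means module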

SCSI

This menu could be found during menuconfig in the following location

  Device Drivers  --->
    SCSI device support  --->
Note: If built as modules, the module binaries (.ko) will be available in the drivers/scsi folder.
  • CONFIG_BLK_DEV_SD : SCSI DISK support
  • CONFIG_BLK_DEV_SR : SCSI CD/DVD support
  • CONFIG_BLK_DEV_SR_VENDOR
  • CONFIG_CHR_DEV_SG : SCSI generic support (scanner, printer)


File-system specific kernel configuration

This menu could be found during menuconfig in the following location

  File systems  --->

EXT2

Note: If built as modules, the module binaries (.ko) will be available in the fs/ext2 folder.
  • CONFIG_EXT2_FS
  • CONFIG_EXT2_FS_XATTR
  • CONFIG_EXT2_FS_POSIX_ACL
  • CONFIG_EXT2_FS_SECURITY

EXT3

Note: If built as modules, the module binaries (.ko) will be available in the fs/ext3 folder.
  • CONFIG_EXT3_FS
  • CONFIG_EXT3_FS_XATTR
  • CONFIG_EXT3_FS_POSIX_ACL
  • CONFIG_EXT3_FS_SECURITY

CD/DVD

Note: If built as a module, the module binary (.ko) will be available in the fs/isofs folder.
  • CONFIG_ISO9660_FS

Misc

Note: If built as modules, the module binaries (.ko) will be available in the fs/fat folder.
  • CONFIG_FAT_FS
  • CONFIG_MSDOS_FS
  • CONFIG_VFAT_FS
  • CONFIG_FAT_DEFAULT_CODEPAGE=437
  • CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"

Loadable Module Support for Platform AHCI driver

Note: This is based on kernel version >= 2.6.37.
  • This section assumes that the necessary environment variables (like ARCH and CROSS_COMPILE) are set for your platform.
  • As mentioned in the previous section, every component can be built as a module. This section explains it for platform AHCI support on the DM816x platform.
  • Configure the kernel with the defconfig which came in the release; usually the defconfig builds the driver into the kernel
  make ti8168_evm_defconfig
  • Modify the platform AHCI driver to be built as a loadable module. Run make menuconfig (or similar kernel configuration command), navigate to the below mentioned item and set it as [M]
  Device Drivers  --->
     <*> Serial ATA and Parallel ATA drivers  --->
             <M>   Platform AHCI SATA support
  • Build the kernel (this output image should be used for booting the target)
  make uImage
  • Build the platform AHCI driver as module
  make modules
  "after completion of build, module binary should be available in drivers/ata/ahci_platform.ko"
  • Copy the modules binary to the target and load it manually using following command (which is run from the target)
  insmod ahci_platform.ko
  • If libahci, ahci, and ahci_platform were built as modules, then the following sequences should be used when inserting and removing the modules.

Module insertion sequence

insmod libahci.ko
insmod ahci.ko
insmod ahci_platform.ko

Module removal sequence

rmmod ahci_platform.ko
rmmod ahci.ko
rmmod libahci.ko
  • If you are using modprobe then please refer to the Wikipedia article on modprobe; a sketch is shown below.
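
For reference, a hedged modprobe-based alternative, assuming the modules were installed on the target with 'make modules_install' and the module dependency database (depmod) is up to date:

  modprobe ahci_platform     ==> loads the driver plus its dependencies automatically
  modprobe -r ahci_platform  ==> removes the driver and any now-unused dependencies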

How do I enable debug & log messages in Linux

The following kernel configuration variables enable logging and debug message support.

  • Enable ATA verbose messages: CONFIG_ATA_VERBOSE_ERROR. Enable it from the following menu during menuconfig:
  Device Drivers  --->
     [*] Serial ATA and Parallel ATA drivers  --->
            [*]   Verbose ATA error reporting
  • Verbose SCSI error reporting: CONFIG_SCSI_CONSTANTS. Enable it from the following menu during menuconfig:
  Device Drivers  --->
     SCSI device support  --->
        [*] Verbose SCSI error reporting (kernel size +=12K)
  • SCSI logging facility: CONFIG_SCSI_LOGGING. Enable it from the following menu during menuconfig:
  Device Drivers  --->
     SCSI device support  --->
        [*] SCSI logging facility
Note: SCSI logging can be intercepted using syslogd or klogd; please refer to the respective man pages for configuration details.
  • SCSI supports the following events, and each event supports log levels 0 to 7 (3 bits), where 0 means logging disabled and 7 is maximum verbosity. The log-level bitmask is a 32-bit value whose bit fields are explained below.
  error       : Bits   2:0, sets error log-level
  timeout     : Bits   5:3, sets timeout log-level
  scan        : Bits   8:6, sets device scanning related log-level
  mlqueue     : Bits  11:9, sets SCSI middle-layer queuing related log-level
  mlcomplete  : Bits 14:12, sets SCSI middle-layer completion related log-level
  llqueue     : Bits 17:15, sets SCSI low-level driver queue related log-level
  llcomplete  : Bits 20:18, sets SCSI low-level driver completion related log-level
  hlqueue     : Bits 23:21, sets SCSI high-layer queue related log-level
  hlcomplete  : Bits 26:24, sets SCSI high-layer completion related log-level
  ioctl       : Bits 29:27, sets SCSI ioctl call events related log-level
  all         : All bits '1', which is 0xFFFFFFFF. Enables logging for all events with max verbosity for each event.
                This event string could be used only in legacy procFS mode.
  none        : All bits '0', which is 0x00000000. Disables logging for all events.
                This event string could be used only in legacy procFS mode.
Legacy Support

Note: In Linux 2.6 this has been superseded by files in sysfs, but many legacy applications rely on it. It might be removed in future kernel releases.
  • Enable the legacy SCSI procfs entry CONFIG_SCSI_PROC_FS during menuconfig from,
  Device Drivers  --->
      SCSI device support  --->
          [*] legacy /proc/scsi/ support
  • The level of logging can be tuned using the proc entry /proc/scsi/scsi. For example,
   echo "scsi log all" > /proc/scsi/scsi ==> enables logging of all events.
  • Logging can also be tuned for a specific event. For example,
   echo "scsi log timeout 5" > /proc/scsi/scsi ==> sets log-level 5 for timeout events.
  • For more information please refer [11].

Linux Kernel 2.6 onwards

  • Use the proc entry /proc/sys/dev/scsi/logging_level for tuning the log level. You need to pass bitmask values to set the log-level; please refer to drivers/scsi/scsi_logging.h in the kernel source directory for the bitmask values. Use the echo command to set the log-level. The example below enables logging of all events with max verbosity
  echo 0xFFFFFFFF > /proc/sys/dev/scsi/logging_level
  • The same can also be enabled using the kernel boot parameter scsi_logging_level. Pass this using the bootargs variable in u-boot; it would look something like scsi_logging_level=0xFFFFFFFF, which enables logging of all events with max verbosity.
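
As a worked example of composing the bitmask: to set only the timeout log-level to 5, place the value 5 in bits 5:3 (5 << 3 = 0x28):

  echo 0x28 > /proc/sys/dev/scsi/logging_level  ==> timeout log-level 5, all other events disabled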

Linux IO scheduler

What are IO schedulers

The IO scheduler decides the order in which block operations are submitted to the storage device. It schedules the pending I/O requests to minimize the time spent moving the disk head; this in turn minimizes disk seek time and maximizes hard disk throughput. It accomplishes this by sorting requests by block number, and by merging: when an I/O request is issued to an identical or adjacent region of the disk, instead of issuing the new request on its own it is merged into the existing request. This minimizes the number of outstanding requests.

What are the different IO schedulers supported by Linux (version > 2.6.18)

      NOOP => No IO scheduling
  DEADLINE => Imposes deadline on all I/O operations to prevent starvation of requests
       CFQ => "Completely Fair Queuing". Distributes bandwidth equally among all processes in the system

The default IO scheduler in Linux is CFQ. The IO scheduler is selected globally in the kernel configuration; at runtime, the IO scheduler for a device can be changed using sysfs entries.
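
The global default can also be overridden at boot time; a hedged sketch using the 'elevator' kernel parameter supported by 2.6 kernels (passed via the u-boot bootargs variable):

  elevator=deadline  ==> makes "deadline" the default IO scheduler for all block devices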

How to identify which IO scheduler a device is using

  cat /sys/block/sda/queue/scheduler
noop deadline [cfq]

In the above example device sda uses "cfq" IO scheduler.

How to modify an IO scheduler for a device

  echo "deadline" > /sys/block/sda/queue/scheduler

This command switches device sda to the "deadline" IO scheduler.

SATA Validation

How is the SATA driver validated before a software release

SATA has a test matrix which is used for sanity testing before every release. Most of the test cases are covered by the LFTB filesystem test suite. Currently the LFTB filesystem test suite supports performance, functionality and stress tests for EXT2/VFAT filesystems. Some test cases, like dual port support, have to be done manually with LFTB's help.

What are the filesystems validated on SATA

We have validated EXT2 and VFAT for HDDs. We have also validated iso9660 for CDs.

Filesystems

What are the filesystems supported by Linux PSP releases

We use EXT2 and VFAT for all our validation. Other filesystems should also work, but we have not tested them.

Which filesystem is better for SATA

It depends on the host operating system. Since all our releases are Linux based, EXT2, EXT3, EXT4 are better.

Why VFAT is not best suited for Linux with respect to performance

VFAT has a lot of limitations, and it is not a Linux native filesystem. Current VFAT support in the kernel is not evolving; it is just maintained as-is.

Which filesystem is better, EXT2 or EXT3

EXT2 wins on performance and EXT3 wins on reliability: EXT3 uses journaling and is more power-failure safe. But EXT4 has proved to be better than EXT3 in all ways. Refer to the Linux EXT FAQ from kernel.org; Wikipedia also has a good article on filesystem comparison.

Tools for creating filesystems

VFAT : mkfs.vfat

EXT2 : mkfs.ext2

EXT3 : mke2fs

EXT4 : mke2fs

mke2fs angstrom package
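
As illustrative invocations (assuming the target partition is /dev/sda1; see the respective man pages for the full option set):

  mkfs.vfat -F 32 /dev/sda1  ==> FAT32
  mkfs.ext2 /dev/sda1        ==> EXT2
  mke2fs -j /dev/sda1        ==> EXT3 (-j enables the journal)
  mke2fs -t ext4 /dev/sda1   ==> EXT4 (requires a reasonably recent e2fsprogs)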

SATA Performance and Profiling

Setup:

DM816X: CPU->1Ghz,   DDR3->400Mhz, Seagate Barracuda 7200.11, Linux 2.6.37,     Linux PSP Release->04.00.00.10
DM814X: CPU->600Mhz, DDR2->400Mhz, Seagate Barracuda 7200.11, Linux 2.6.37,     Linux PSP Release->04.01.00.03
 AM18X: CPU->300Mhz, mDDR->132Mhz, Seagate Barracuda 7200.11, Linux 2.6.33-rc4, Linux PSP Release->03.20.00.14

TOOL: Linux Functional Test Bench (LFTB) version 02.00.00.03, Filesystem test suite

For profiling please refer to [Linux Oprofile]. Also, refer to the actual product data sheet for the detailed test setup.

LFTB releases are available in [LFTB Arago GIT]

EXT2 Write Performance

  Buffer Size   Total Bytes        Transfer Rate (MBps)          CPU Load (%)
  (KB)          Transferred (MB)   DM816X   DM814X   AM18X       DM816X   DM814X   AM18X
  100           100                89.84    74.37    23.17       31.90    40.85    98.46
  1024          100                77.22    70.47    23.67       25.74    40.27    97.98


EXT2 Read Performance

  Buffer Size   Total Bytes        Transfer Rate (MBps)          CPU Load (%)
  (KB)          Transferred (MB)   DM816X   DM814X   AM18X       DM816X   DM814X   AM18X
  100           100                114.12   114.98   39.61       46.15    68.13    98.88
  1024          100                107.29   107.29   39.82       42.86    66.67    99.25


VFAT(FAT32) Write Performance

  Buffer Size   Total Bytes        Transfer Rate (MBps)          CPU Load (%)
  (KB)          Transferred (MB)   DM816X   DM814X   AM18X       DM816X   DM814X   AM18X
  100           100                64.75    57.08    NA          45.68    64.67    NA
  1024          100                63.84    54.58    NA          43.90    64.50    NA


VFAT(FAT32) Read Performance

  Buffer Size   Total Bytes        Transfer Rate (MBps)          CPU Load (%)
  (KB)          Transferred (MB)   DM816X   DM814X   AM18X       DM816X   DM814X   AM18X
  100           100                110.31   106.10   NA          42.11    70.41    NA
  1024          100                109.29   106.93   NA          45.26    73.47    NA


Why is the peak performance published nowhere close to the 300 MBytes/sec range

300 MBytes/sec is the theoretical data rate for a 3 Gbps link. The data rates published in the datasheet are sustained data rates, which include filesystem overhead. Since the benchmarks run from userspace, there is also a copy from userspace to kernel space, which adds overhead. Also, the sustained data rate published by the manufacturer of the HDD used in our tests is only ~150 MBytes/sec, and we are close to that number.

Why are copy_from_user() and copy_to_user() bottlenecks

These two APIs are not really a bottleneck for performance, but for CPU usage. For every HDD write from an application, the buffer is first copied from userspace to kernel space, then to the page cache, and finally DMAed to the HDD using the SATA controller DMA. Except for the final DMA, all of these copies are done by the CPU. Even though these CPU copies are optimized in the Linux kernel, there can be L2 cache misses which stall the CPU. When we profiled a SATA transaction using 'oprofile', the top contributors to CPU load were these copy operations; they are also the top contributors to L2 cache misses. Hence, if we could avoid these copies, we could improve CPU utilization if not throughput.

Why can't we replace copy_from_user() and copy_to_user() with DMA

It is not feasible for the following reasons.

  • The copy operation in the Linux kernel discussed here is an atomic operation during which other processes cannot be scheduled. Hence we cannot trigger a DMA and sleep, so the DMA would have to run in polled mode.
  • The from and to locations for the DMA have to be made cache coherent (for the length of the copy). For example, for a write we have to flush the from region to DDR and invalidate the cache entries of the to region. This is a huge overhead.
  • The length of the copy operation in Linux is usually not more than 4 KiB, so using the CPU for copying makes more sense than using a DMA.
  • Based on the experiments we did on DM816X, the expected gain from moving to EDMA is lost to cache management.

Please refer to the How to Tune Performance and CPU utilization FAQ below.

Why different TI devices have different performance numbers

Even though all these devices use the same SATA host controller IP, other parameters affect SATA performance. For example, DM816X runs at 1 GHz, hence there is a huge difference between DM816X and AM18X.

What are the factors leading to performance differences between TI devices

Processor speed, Memory speed, Cache size and Linux kernel version.

The DM8168 EVM has two SATA ports; will there be any performance drop if two devices are connected (one to each port)

No; each port has a separate DMA, hence the performance will not drop. But the CPU usage will increase.

Will the performance improve for DDR3

Yes, DDR3 at a 1600 MHz data rate can give better performance than DDR2 at an 800 MHz data rate.

How to improve VFAT performance in Linux

We can improve performance by tuning VFAT parameters during filesystem creation. The two parameters to consider are the number of sectors per cluster and the logical sector size. Modifying these parameters will vary the usable disk space and the wasted disk space, so a tradeoff has to be made between disk size and performance. Also note that not all settings are compatible with the Windows OS. The table below captures the commands and the improvement.

  Command                                      Sectors per   Logical sector   Improvement   Description
                                               cluster       size (bytes)
  mkfs.vfat -F 32 /dev/sda1                    8             512              1X            Default FAT32 filesystem
  mkfs.vfat -F 32 -s 32 -S 512 -v /dev/sda1    32            512              2X            Might be compatible with Windows OS
  mkfs.vfat -F 32 -s 16 -S 4096 -v /dev/sda1   16            4096             3X            Might not be compatible with Windows OS

How do we interpret IOWAIT reported by Linux commands like "top", "iostat" and "mpstat"

Usually "iowait" results are misinterpreted. "iowait" is actually a portion of "cpu idle". Which means amount of time a process has been idle waiting for an IO to complete. Hence the CPU is free for this duration. But the iowait numbers could also be used to analyze the IO performance of the system.

Please refer respective man pages for details on these command.
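
For example, a hedged way to watch iowait alongside device throughput, assuming the sysstat tools are installed on the target:

  iostat -x 1  ==> prints CPU %iowait and extended per-device statistics every second
  mpstat 1     ==> prints per-CPU utilization, including %iowait, every second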

What are the open source tools for benchmarking IO

iometer: an open source tool which can be used for benchmarking system IO. It was originally developed by Intel for x86 systems; later it became an open source project and is now used widely. It has to be cross-compiled for ARM to be used on our platform. We have not used/validated this tool. Visit [12] for more details.

bonnie++: Visit [13] for more details

hdparm: Wikipedia has a good article on hdparm [14]. Simply run the command to get a complete list of options. The tool can be downloaded from [Angstrom-Distribution] for different ARM architectures. Some basic examples:

  hdparm -i /dev/sda          ==> Displays drive identification.
  hdparm -t /dev/sda          ==> Performs device read timings on /dev/sda and reports the numbers.
  hdparm -t --direct /dev/sda ==> Uses the O_DIRECT flag while opening the device, bypassing the page cache for timings.


How to tune Performance and CPU utilization

As discussed earlier, the filesystem and the userspace <-> kernel-space copy operation are the bottlenecks. To get the best performance/CPU utilization during SATA operation, the following could be tried:

  • Do not use a filesystem; instead open the raw device (for example /dev/sda) and perform reads on that device.
  • Use the O_DIRECT flag while opening the device. This makes sure the page cache is not used for the operation. If O_DIRECT is used, then the buffer size and buffer address should be aligned to the block size of the device (or of the filesystem, if one is used). Please check the [man page] for more information.
  • Do not use the O_NONBLOCK flag while opening the device. This makes sure the buffer passed to the read()/write() syscall is free for userspace usage when the syscall returns.
  • Make sure the buffer size is big (i.e. > 256 KiB). This improves both performance and CPU usage.
  • Buffers from cacheable memory regions will yield better performance/CPU utilization numbers (in a system scenario).
  • Lock the memory used by your application using mlockall() and unlock it at the end using munlockall(). This makes sure that all pages required by the application stay in memory and are not swapped or flushed out.
  • Avoid using fread()/fwrite(); libc will try to buffer the requests.
  • Check the [hdparm] source for a reference implementation.
Warning: Before starting to use O_DIRECT, please go through the mail thread from Linus which debates O_DIRECT usage. Linus inclined towards the use of posix_fadvise() and madvise() instead of O_DIRECT; these two APIs can be used to direct the kernel on page-cache usage for a buffer.

Do we have performance numbers for O_DIRECT flag

We have done some testing on DM816X using the O_DIRECT flag. The 'dd' tool can be used with the 'iflag=direct' or 'oflag=direct' option.

EVM        : DDR2 @ 400MHz
HDD        : WD5001AALS
Link Speed : 3Gbps
Data Size  : 2GB
Board      : DM816X EVM
Read cmd   : dd if=/dev/sda of=/dev/null iflag=direct bs=<block_size> count=<count>
Write cmd  : dd if=/dev/zero of=/dev/sda oflag=direct bs=<block_size> count=<count>
NOTE       : Used oprofile for profiling CPU utilization. No filesystem used.

The graph below captures performance/cpu utilization for different buffer sizes.

Figure: SATA O_DIRECT performance / CPU utilization vs. buffer size

Refer File:TI-SATA-FAQ-O DIRECT-Testing.zip for more info.

Will there be any performance degradation if a Port Multiplier is used

Theoretically the numbers should come down, as our host does not support FIS based switching. If only one disk is connected to the PMP, we will not see much difference; but as more disks are connected to the PMP, performance degrades steeply.

For example, with 3 HDDs and simultaneous READ access to all three, we would get 10MBps per HDD at 75% CPU load. Similarly for write, we would get 15MBps per HDD at 80% CPU load.

Why are AM18X BIOS performance numbers better than Linux numbers

The BIOS numbers are not published here, but they are better than Linux. This is mainly due to how BIOS and Linux architect their software layers: the BIOS SATA stack is lightweight and hence gives better performance, while the Linux stack is feature rich.

Why is HDD performance better on a PC/desktop

TI SOC SATA throughput cannot be directly compared with PC throughput numbers, for various reasons. For example, an x86 PC has 2 MB or 4 MB of L2/L3 cache and the processor runs at a very high speed (e.g. 2.5 GHz); most PCs are also multi-core. DM8168 has 256 KB of L2 cache and the core operates at 1 GHz. Hence there will be a performance difference.

What are the overheads impacting SATA performance

The raw throughput numbers claimed by the drive manufacturer include data throughput and SATA protocol handshake. Usually there is a large overhead on the bus due to this handshake. Also, SATA uses 8b/10b encoding, so for every 8 bits of data there are 10 bits on the bus. So, if we are achieving 100 MBytes/sec, it actually means 100 MBytes/sec of data plus 8b/10b encoding overhead plus SATA handshake overhead. This overhead is ~10 to 20% on top of the actual data; hence for 100 MBytes/sec of data, the bus might carry ~120 MBytes/sec.

There are various other overheads apart from those stated above. For example, the filesystem uses metadata; there is a userspace to kernel-space (and vice versa) copy involved in every read/write operation; and then the data is DMAed to the SATA controller. The userspace <-> kernel-space copy is done by the CPU (hence CPU speed is a constraint) and has to take care of cache coherency (so cache size is also a constraint). There is also overhead from interrupts and context switches in the operating system. Furthermore, there are various layers in the operating system, like userspace <-> Virtual File System <-> actual file system <-> SCSI <-> ATA <-> AHCI <-> DMA, although the data itself is not copied in all cases (only pointers). So the operating system itself is, in a way, an overhead.

CD/DVD Writing Tools

Are there any opensource tools for CD/DVD burning

The open source cdrtools package can be used for CD/DVD writing. The tool should be cross-compiled for ARM; please check the instructions in the tool's source for more info.

Did we validate CD writing using our SATA interface

Yes, we have used the open source cdrtools with a customer-provided wrapper for validating a CD writer.

Note: The wrapper cannot be shared, as we received it from a customer for testing their modules.
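
For reference, hedged cdrecord invocations from the cdrtools package (device addressing varies across cdrtools versions, so treat the device argument as an assumption and check the scanbus output first):

  cdrecord -scanbus                   ==> lists the detected writers
  cdrecord -v dev=/dev/sr0 image.iso  ==> burns image.iso (newer cdrtools accept Linux device nodes)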

Power Management (PM)

SATA Power Management Features

SATA has the following power management features:

  • Phy Ready – capable of sending and receiving data; the main phase-locked loop is on and active.
  • Partial – the physical layer is powered but in a reduced state; it must be able to return to Phy Ready within 10 us.
  • Slumber – the physical layer is powered but in a reduced state; it must be able to return to Phy Ready within 10 ms.
  • ATA also defines IDLE, STANDBY, and SLEEP.
  • SATA also supports hotplug.

How much of Power Management related tests were carried out

Basic power management features like suspend/resume have been validated on AM18x. We have also validated Dynamic Voltage and Frequency Scaling (DVFS) with SATA on AM18x. On DM81xx devices, we have not done power management validation. Also, based on our analysis, if a SATA device supports power management then it will enter power save (standby) mode when there is no activity on the link.

Linux Power Management for SATA

Linux has very good PM support for ATA/SATA, but predominantly on the PCI master side. Platform AHCI power management support is still evolving.

Pins

Unused Pins

A TI SOC might not bring out all of the SATA pins. Check the TRM or datasheet of the SOC for details on unused/unsupported pins.

How to handle unused pins

This varies and depends on the SOC itself. Check the pin-mux details and recommendations for the TI SOC for information on un-used pins.

If I'm not using SATA in a TI SOC, can I use those pins for other purposes

Yes, you can use them for other purposes provided the SOC supports it. This depends very much on the pin-muxing features of the SOC. Refer to the pin-mux document of the respective device for alternate usage.

Debugging TIPS

Warning: These tips do not warrant against any damage caused to the device in use. Use them at your own risk.

Setup works with an earlier release but not with the current release

  • Check if you are using the latest and correct release.
  • Check the Release Notes that came with the release package for any applicable "Known Issues" or "Not Supported" features.
  • Check if "AHCI Platform" support is enabled in the Linux kernel configuration.
  • Try the pre-built binary from the Linux PSP release (also check that the pre-built binary has AHCI support enabled).
  • Post a query on the [http://e2e.ti.com/] forum. Contacting the software teams directly is not recommended, but for critical/production-stop issues please contact the TI sales team.

Linux is not enumerating the device

  • Check if you are using the latest Linux PSP release
  • Check if "AHCI Platform" support is enabled in the Linux kernel configuration
  • Check if the required SCSI support is enabled in the kernel configuration
    • "SCSI disk support" -> This is required for HDD
    • "SCSI CDROM support" -> This is required for CD/DVD drives
    • "SCSI generic support" -> This is mostly required for SCSI scanners or just anything having SCSI in its name.
  • Check if the device is powered properly. You can confirm this by,
    • Checking the LEDs on the device (if they have one)
    • Touching the device (make sure you are ESD protected) and see if you could feel the spindle rotating
    • In case of CD/DVD, use the eject button to check the operation
  • Change the data cable and/or power cable
  • Try a SATA device from a different manufacturer. If this works, then note down the part which did not work and post a query on [http://e2e.ti.com/] forum.
  • If the device is not working on a customer board then try the same device on a TI EVM. If it works with the TI EVM then the customer board might be the problem.
  • You could also force the device to operate at Gen 1 speed and see if it works. Refer to the "How do I make a HDD work at 1.5Gbps" section for forcing a HDD to operate at 1.5 Gbps.

Linux enumerates the device but error messages are observed during SATA transactions

  • This is a rare scenario.
  • The H/W (EVM or the customer board) could be the problem. Please change the boards and check again.
  • PHY configuration might not be compatible with the device under test [rare chance].

Port Multiplier is not detected

  • Check if "PMP" support is enabled in Linux kernel configuration.
  • All PMPs should support command based switching by default. Please check the PMP product datasheet for any known issues.

Why is the kernel displaying too many debug messages while enumerating SATA devices

  • Check if debug messages are enabled in kernel configuration.
  • Linux PSP releases enable them in the defconfig. It is recommended to keep debug messages enabled.

I'm not able to mount CD drives

  • Check if "SCSI CDROM support" is enabled in kernel configuration
Device Drivers  --->
   SCSI device support  ---> 
      <*> SCSI CDROM support
      [*]   Enable vendor-specific extensions (for SCSI CDROM)
  • Check if "ISO 9660 CDROM file system support" support is enabled under
Filesystems -->
  CD-ROM/DVD Filesystems --->
     <*> ISO 9660 CDROM file system support
  • Change the Media and try again

Linux not enumerating a RAID storage tower

  • Check if the RAID device is configured correctly. Use the RAID vendor provided tool for configuring RAID

My HDD supports 3Gbps but Linux works only at 1.5Gbps

  • If the device is connected via a PMP, then most likely the PMP is restricting the speed
  • The PHY configuration may not be compatible with that device.

System becomes unusable during SATA transaction

  • This is mostly due to the CPU load from the SATA transactions.
  • There is no direct solution to this problem; the system architecture and usecase have to be analyzed.
  • Check if the published performance benchmark numbers and CPU load meet your usecase.
  • If the application writes to the HDD from multiple threads, try adjusting the buffer size based on trial and error. The larger the buffer size, the better the performance (in a multi-threaded scenario).

How to use PC SMPS to provide power to SATA devices

  • Usually TI EVMs come with a connector for providing power to a SATA device. If it is not available, or if you want to power more devices, then a PC SMPS can be used.
  • PC SMPSes usually come with SATA power connectors; if not, you need to get a converter cable separately and attach it to the SMPS.
  • A PC SMPS will not power up until pins 4 & 5 of the big connector (which in PCs is connected to the motherboard) are shorted (as shown in the diagram below).

Figure: PC SMPS loopback connection (TI-SATA-FAQ-PC-SMPS-Loopback.jpg)

Note: Please refer to the SMPS datasheet for details on the loopback.

Linux PSP release worked on an old EVM revision but does not work on a new EVM

  • Make sure the SATA device in use works with the old EVM. If it does not work with the old EVM, the SATA device could be faulty. Always use a SATA device which worked with the old EVM (to debug the new EVM).
  • Check if there are any changes in the new EVM with respect to SATA/pin-muxing. To get this information, compare the schematics or get help from the EVM team. Check with the Linux PSP team whether those hardware differences/changes require a software change.
  • If there are no differences in the board, then most likely the new board is faulty. Note down the board serial number and inform the EVM team. Chances are that only the board under debug is faulty, so try another board (of the same revision).


How to generate SATA data traffic

Note: This section assumes that a SATA HDD is used, that the device node is /dev/sda and that the partition used is /dev/sda1. It uses the ext2 filesystem for demonstration.
  • Delete all partitions on the HDD and then create a partition on the SATA device: check the fdisk help in the Sitara PSP Test Setup page.
  • Create ext2 filesystem on the SATA partition
  mkfs.ext2 /dev/sda1
  • mount SATA partition to some mount point
  mount -t ext2 /dev/sda1 /media/sda1
  • Generate write traffic (1GB write traffic and write buffer size is 1MB)
  dd if=/dev/zero of=/media/sda1/test-file.bin bs=1M count=1024
  • Drop the buffer caches so that accesses go to the SATA hardware and are not served from the buffer cache.
  echo 3 > /proc/sys/vm/drop_caches
  • Generate read traffic (1GB read traffic and read buffer size is 1MB)
  dd if=/media/sda1/test-file.bin of=/dev/null bs=1M count=1024
  • For continuous traffic, repeat last 3 steps in a loop

Simultaneous write/read traffic

  • In the above example, write and read are sequential. If you want them to be simultaneous, do the following:
  • Create ext2 filesystem on the SATA partition
  mkfs.ext2 /dev/sda1
  • mount SATA partition to some mount point
  mount -t ext2 /dev/sda1 /media/sda1
  • Generate read file and drop caches
  dd if=/dev/zero of=/media/sda1/read-file.bin bs=1M count=1024
  echo 3 > /proc/sys/vm/drop_caches
  • Generate write/read traffic
  dd if=/dev/zero of=/media/sda1/write-file.bin bs=1M count=1024 &
  dd if=/media/sda1/read-file.bin of=/dev/null bs=1M count=1024
  echo 3 > /proc/sys/vm/drop_caches
  
  "For continuous traffic, repeat this step in a loop"

Resources