ZHITAI SSD frequently becomes read-only

Which system do you use? Android, Ubuntu, OOWOW or others?

OOWOW

Which version of system do you use? Khadas official images, self built images, or others?

Fenix 1.0.11 Ubuntu 22.04.2 LTS Linux 5.4.125

Please describe your issue below:

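I am using the M2X extension board with a ZHITAI SSD (ZHITAI TiPlus5000 512G) mounted. After some time in use, the SSD switches to read-only mode. I have already had the SSD replaced twice by the retailer, and the same thing happened each time.

The dmesg output is as follows: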

[244232.129541] JBD2: Detected IO errors while flushing file data on nvme0n1-8
[244246.652005] print_req_error: 38 callbacks suppressed
[244246.652021] blk_update_request: I/O error, dev nvme0n1, sector 40673392 op 0x1:(WRITE) flags 0x4000 phys_seg 98 prio class 0
[244246.652797] EXT4-fs warning: 30 callbacks suppressed
[244246.652804] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 638320640 size 7602176 starting block 5084341)
[244246.652812] buffer_io_error: 23023 callbacks suppressed
[244246.652816] Buffer I/O error on device nvme0n1, logical block 5084046
[244246.653646] Buffer I/O error on device nvme0n1, logical block 5084047
[244246.654479] Buffer I/O error on device nvme0n1, logical block 5084048
[244246.655323] Buffer I/O error on device nvme0n1, logical block 5084049
[244246.656169] Buffer I/O error on device nvme0n1, logical block 5084050
[244246.657014] Buffer I/O error on device nvme0n1, logical block 5084051
[244246.657858] Buffer I/O error on device nvme0n1, logical block 5084052
[244246.658703] Buffer I/O error on device nvme0n1, logical block 5084053
[244246.659550] Buffer I/O error on device nvme0n1, logical block 5084054
[244246.660393] Buffer I/O error on device nvme0n1, logical block 5084055
[244246.686740] blk_update_request: I/O error, dev nvme0n1, sector 40675752 op 0x1:(WRITE) flags 0x4000 phys_seg 64 prio class 0
[244246.697574] blk_update_request: I/O error, dev nvme0n1, sector 40677800 op 0x1:(WRITE) flags 0x4000 phys_seg 38 prio class 0
[244246.698354] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 638320640 size 7602176 starting block 5084864)
[244246.776794] blk_update_request: I/O error, dev nvme0n1, sector 40701952 op 0x1:(WRITE) flags 0x4000 phys_seg 63 prio class 0
[244246.777585] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 645922816 size 8388608 starting block 5088281)
[244246.842589] blk_update_request: I/O error, dev nvme0n1, sector 40709256 op 0x1:(WRITE) flags 0x4000 phys_seg 62 prio class 0
[244246.843381] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 645922816 size 8388608 starting block 5089118)
[244246.899768] blk_update_request: I/O error, dev nvme0n1, sector 40718336 op 0x1:(WRITE) flags 0x4000 phys_seg 20 prio class 0
[244246.902942] blk_update_request: I/O error, dev nvme0n1, sector 40719360 op 0x1:(WRITE) flags 0x0 phys_seg 24 prio class 0
[244246.903688] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 654311424 size 8388608 starting block 5089994)
[244247.017793] blk_update_request: I/O error, dev nvme0n1, sector 40911872 op 0x1:(WRITE) flags 0x4000 phys_seg 114 prio class 0
[244247.018591] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 754974720 size 8388608 starting block 5114207)
[244247.024769] blk_update_request: I/O error, dev nvme0n1, sector 40915704 op 0x1:(WRITE) flags 0x4000 phys_seg 53 prio class 0
[244247.025555] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 754974720 size 8388608 starting block 5114876)
[244247.030352] blk_update_request: I/O error, dev nvme0n1, sector 40953432 op 0x1:(WRITE) flags 0x4000 phys_seg 12 prio class 0
[244247.037329] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 771751936 size 8388608 starting block 5119589)
[244247.061069] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 805306368 size 8388608 starting block 5127751)
[244247.130066] JBD2: Detected IO errors while flushing file data on nvme0n1-8
[244252.366101] print_req_error: 3 callbacks suppressed
[244252.366110] blk_update_request: I/O error, dev nvme0n1, sector 41044024 op 0x1:(WRITE) flags 0x4000 phys_seg 97 prio class 0
[244252.366876] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 822083584 size 8388608 starting block 5130908)
[244252.366881] buffer_io_error: 4466 callbacks suppressed
[244252.366885] Buffer I/O error on device nvme0n1, logical block 5130503
[244252.367733] Buffer I/O error on device nvme0n1, logical block 5130504
[244252.368563] Buffer I/O error on device nvme0n1, logical block 5130505
[244252.369406] Buffer I/O error on device nvme0n1, logical block 5130506
[244252.370250] Buffer I/O error on device nvme0n1, logical block 5130507
[244252.371096] Buffer I/O error on device nvme0n1, logical block 5130508
[244252.371939] Buffer I/O error on device nvme0n1, logical block 5130509
[244252.372784] Buffer I/O error on device nvme0n1, logical block 5130510
[244252.373631] Buffer I/O error on device nvme0n1, logical block 5130511
[244252.374475] Buffer I/O error on device nvme0n1, logical block 5130512
[244252.375747] blk_update_request: I/O error, dev nvme0n1, sector 41049312 op 0x1:(WRITE) flags 0x4000 phys_seg 67 prio class 0
[244252.376766] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 822083584 size 8388608 starting block 5131437)
[244252.603879] blk_update_request: I/O error, dev nvme0n1, sector 41306112 op 0x1:(WRITE) flags 0x4000 phys_seg 49 prio class 0
[244252.604650] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 956301312 size 8388608 starting block 5163521)
[244252.609501] blk_update_request: I/O error, dev nvme0n1, sector 499760976 op 0x1:(WRITE) flags 0x800 phys_seg 32 prio class 0
[244252.616439] blk_update_request: I/O error, dev nvme0n1, sector 41314280 op 0x1:(WRITE) flags 0x4000 phys_seg 91 prio class 0
[244252.617215] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 956301312 size 8388608 starting block 5164721)
[244252.629023] blk_update_request: I/O error, dev nvme0n1, sector 41320448 op 0x1:(WRITE) flags 0x4000 phys_seg 93 prio class 0
[244252.640191] blk_update_request: I/O error, dev nvme0n1, sector 41322496 op 0x1:(WRITE) flags 0x4000 phys_seg 18 prio class 0
[244252.640970] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 964689920 size 8388608 starting block 5165654)
[244252.677881] blk_update_request: I/O error, dev nvme0n1, sector 41333368 op 0x1:(WRITE) flags 0x4000 phys_seg 41 prio class 0
[244252.678657] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 964689920 size 8388608 starting block 5167007)
[244252.683453] blk_update_request: I/O error, dev nvme0n1, sector 41336056 op 0x1:(WRITE) flags 0x0 phys_seg 62 prio class 0
[244252.684190] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 964689920 size 8388608 starting block 5167104)
[244252.696021] blk_update_request: I/O error, dev nvme0n1, sector 41338880 op 0x1:(WRITE) flags 0x0 phys_seg 69 prio class 0
[244252.696757] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 973078528 size 8388608 starting block 5167456)
[244252.754036] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 973078528 size 8388608 starting block 5169152)
[244252.757461] EXT4-fs warning (device nvme0n1): ext4_end_bio:309: I/O error 10 writing to inode 2752542 (offset 981467136 size 8388608 starting block 5169614)
[244252.973223] JBD2: Detected IO errors while flushing file data on nvme0n1-8
[244252.973288] Aborting journal on device nvme0n1-8.
[244253.243327] EXT4-fs error (device nvme0n1): ext4_journal_check_start:61: Detected aborted journal
[244253.243825] EXT4-fs (nvme0n1): Remounting filesystem read-only
[244254.138685] JBD2: Detected IO errors while flushing file data on nvme0n1-8
[244266.921085] EXT4-fs warning: 29 callbacks suppressed
[244266.921099] EXT4-fs warning (device nvme0n1): ext4_dirblock_csum_verify:372: inode #16516455: comm photoprism: No space for directory leaf checksum. Please run e2fsck -D.
[244266.921108] EXT4-fs error (device nvme0n1): __ext4_find_entry:1615: inode #16516455: comm photoprism: checksumming directory block 0
[244266.922499] EXT4-fs warning (device nvme0n1): ext4_dirblock_csum_verify:372: inode #16516455: comm photoprism: No space for directory leaf checksum. Please run e2fsck -D.
[244266.922507] EXT4-fs error (device nvme0n1): __ext4_find_entry:1615: inode #16516455: comm photoprism: checksumming directory block 0
[244267.466141] EXT4-fs warning (device nvme0n1): ext4_dirblock_csum_verify:372: inode #16516455: comm photoprism: No space for directory leaf checksum. Please run e2fsck -D.
[244267.466153] EXT4-fs error (device nvme0n1): __ext4_find_entry:1615: inode #16516455: comm photoprism: checksumming directory block 0
[244267.481104] EXT4-fs warning (device nvme0n1): ext4_dirblock_csum_verify:372: inode #16516455: comm photoprism: No space for directory leaf checksum. Please run e2fsck -D.
[244267.481115] EXT4-fs error (device nvme0n1): __ext4_find_entry:1615: inode #16516455: comm photoprism: checksumming directory block 0
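
A long dmesg dump like this is easier to reason about once it is condensed. The following is a minimal sketch, not an official tool, that counts the `blk_update_request` errors per device and operation and records the time span and sector range they cover. The function name `summarize_blk_errors` and the regular expression are illustrative assumptions that match only the line format shown in this log; other kernel versions may print these lines differently.

```python
import re

# Matches block-layer error lines in the format shown in the log above, e.g.
# [244246.652021] blk_update_request: I/O error, dev nvme0n1, sector 40673392 op 0x1:(WRITE) ...
BLK_ERR = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] blk_update_request: I/O error, "
    r"dev (?P<dev>[^,\s]+), sector (?P<sector>\d+) op \S+:\((?P<op>\w+)\)"
)


def summarize_blk_errors(log_text):
    """Group blk_update_request errors by (device, operation)."""
    summary = {}
    for line in log_text.splitlines():
        m = BLK_ERR.search(line)
        if m is None:
            continue  # not a block-layer error line
        ts = float(m["ts"])
        sector = int(m["sector"])
        entry = summary.setdefault(
            (m["dev"], m["op"]),
            {"count": 0, "first_ts": ts, "last_ts": ts,
             "min_sector": sector, "max_sector": sector},
        )
        entry["count"] += 1
        entry["first_ts"] = min(entry["first_ts"], ts)
        entry["last_ts"] = max(entry["last_ts"], ts)
        entry["min_sector"] = min(entry["min_sector"], sector)
        entry["max_sector"] = max(entry["max_sector"], sector)
    return summary


# Two lines copied from the log above, plus one line that should be ignored.
sample = """\
[244246.652021] blk_update_request: I/O error, dev nvme0n1, sector 40673392 op 0x1:(WRITE) flags 0x4000 phys_seg 98 prio class 0
[244246.652816] Buffer I/O error on device nvme0n1, logical block 5084046
[244246.686740] blk_update_request: I/O error, dev nvme0n1, sector 40675752 op 0x1:(WRITE) flags 0x4000 phys_seg 64 prio class 0
"""

for (dev, op), info in summarize_blk_errors(sample).items():
    print(dev, op, info)
```

Run against the full log, a summary like this makes it easier to see whether the failures are all writes and whether they cluster within a few seconds, as they appear to here.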

Additional disk information from smartctl:

➜  sudo smartctl -a /dev/nvme0n1
smartctl 7.2 2020-12-30 r5155 [aarch64-linux-5.4.125] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       ZHITAI TiPlus5000 512GB
Serial Number:                      ZTA2512KAXXXXXXXXXX
Firmware Version:                   ZTA10093
PCI Vendor/Subsystem ID:            0x1e49
IEEE OUI Identifier:                0xa428b7
Total NVM Capacity:                 512,110,190,592 [512 GB]
Unallocated NVM Capacity:           0
Controller ID:                      0
NVMe Version:                       1.4
Number of Namespaces:               1
Namespace 1 Size/Capacity:          512,110,190,592 [512 GB]
Namespace 1 Utilization:            26,766,525,952 [26.7 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            a428b7 019d480081
Local Time is:                      Sat May 13 21:12:33 2023 CST
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x001f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Log Page Attributes (0x02):         Cmd_Eff_Lg
Maximum Data Transfer Size:         128 Pages
Warning  Comp. Temp. Threshold:     90 Celsius
Critical Comp. Temp. Threshold:     95 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     6.50W       -        -    0  0  0  0        0       0
 1 +     5.80W       -        -    1  1  1  1        0       0
 2 +     3.60W       -        -    2  2  2  2        0       0
 3 -   0.0500W       -        -    3  3  3  3     5000   10000
 4 -   0.0025W       -        -    4  4  4  4     8000   45000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        29 Celsius
Available Spare:                    100%
Available Spare Threshold:          1%
Percentage Used:                    0%
Data Units Read:                    8,778 [4.49 GB]
Data Units Written:                 52,203 [26.7 GB]
Host Read Commands:                 36,547
Host Write Commands:                96,821
Controller Busy Time:               1
Power Cycles:                       2
Power On Hours:                     69
Unsafe Shutdowns:                   1
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               29 Celsius
Temperature Sensor 2:               37 Celsius

Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged
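
The bracketed gigabyte figures in the SMART section can be checked by hand. In the NVMe SMART/Health log, one data unit stands for 1000 units of 512 bytes. The short sketch below redoes that conversion for the two counters shown above; the helper name `data_units_to_bytes` is only illustrative.

```python
# One NVMe SMART "data unit" represents 1000 units of 512 bytes.
BYTES_PER_DATA_UNIT = 1000 * 512


def data_units_to_bytes(units):
    """Convert an NVMe 'Data Units Read/Written' counter to bytes."""
    return units * BYTES_PER_DATA_UNIT


written_bytes = data_units_to_bytes(52_203)  # "Data Units Written" above
read_bytes = data_units_to_bytes(8_778)      # "Data Units Read" above

print(f"written: {written_bytes / 1e9:.1f} GB")  # prints "written: 26.7 GB"
print(f"read:    {read_bytes / 1e9:.2f} GB")     # prints "read:    4.49 GB"
```

Both values agree with what smartctl printed. Together with 0 media errors and 0 error log entries, they show that the drive itself has recorded very little wear and no failures, even though the kernel saw write errors.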

Additional filesystem check output from e2fsck:

➜  ~ sudo e2fsck -n /dev/nvme0n1
e2fsck 1.46.5 (30-Dec-2021)
Warning!  /dev/nvme0n1 is in use.
Warning: skipping journal recovery because doing a read-only filesystem check.
/dev/nvme0n1 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (117417465, counted=117614965).
Fix? no

Free inodes count wrong (31253211, counted=31253158).
Fix? no

/dev/nvme0n1: 7461/31260672 files (0.8% non-contiguous), 7609437/125026902 blocks
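
The block counts reported by e2fsck can be turned into bytes to see how large the discrepancy is. This is a rough sketch that assumes a 4096-byte ext4 block size. The e2fsck output does not print the block size, but the total block count times 4096 equals the namespace size reported by smartctl, which supports the assumption.

```python
BLOCK_SIZE = 4096  # assumption: ext4 block size, not shown in the e2fsck output


def blocks_to_bytes(blocks, block_size=BLOCK_SIZE):
    """Convert a filesystem block count to bytes."""
    return blocks * block_size


total_blocks = 125_026_902   # from the e2fsck summary line
used_blocks = 7_609_437      # from the e2fsck summary line
free_recorded = 117_417_465  # "Free blocks count wrong (...)"
free_counted = 117_614_965   # "counted=..."

# The filesystem spans the whole 512,110,190,592-byte namespace.
print(blocks_to_bytes(total_blocks) == 512_110_190_592)  # prints True

# The recorded free count is simply total minus used.
print(total_blocks - used_blocks == free_recorded)       # prints True

# Difference between the recorded and the recounted free blocks.
delta_blocks = free_counted - free_recorded
print(delta_blocks, blocks_to_bytes(delta_blocks))       # prints 197500 808960000
```

Under that assumption, about 0.8 GB of blocks are accounted for differently in the on-disk summary than in the recount. That would be consistent with writes that were interrupted when the journal aborted, but the check was read-only on a mounted filesystem, so the numbers are only indicative.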

After some time in use, the SSD switches to read-only mode.

Do you have a heatsink on the NVMe?

From the errors it appears the NVMe is problematic.

We are using the WD Red NVMe and WD Blue NVMe, Samsung 970 NVMe and those are working fine.

Also check your power supply, make sure it is very stiff. ARM and NVMe have very high instantaneous current demands so if the regulator cannot keep up the voltage will sag.

And, make sure your USB cable is rated for 5 amps.

1. Do you have a heatsink on the NVMe?

The NVMe has no dedicated fan, but there is a cooling device on the VIM4 board.

2. Also check your power supply, make sure it is very stiff. ARM and NVMe have very high instantaneous current demands so if the regulator cannot keep up the voltage will sag.

I bought the official power supply.

3. Make sure your USB cable is rated for 5 amps.

The cable is from UGREEN, rated for 2.1-3 A. Does the Khadas VIM4 really need as much as 5 A?

You have the correct power supply so that would not be an issue.
Do you have an NVMe from a different manufacturer, like the Samsung 970 or a WD?

Some problems are very difficult to troubleshoot, and the fastest way is to try other hardware.

Does it really need as much as 5 A?

Yes, all the ARM boards need a stiff supply. The NVMe draws a fair amount of current too.

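I have not used any other NVMe SSD.

I have two questions that I hope you can answer:

1. According to the power supply's specifications, its maximum output current is 3 A. Is a 3 A cable sufficient?

2. The SSD is labeled as requiring 3.3 V, 2.5 A. Can the M2X extension supply the required voltage and current?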
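
The current figures in these two questions refer to different voltages, so it helps to compare them in watts. The sketch below only applies P = V × I to numbers already in this thread: the 3.3 V / 2.5 A label on the SSD and the 6.50 W maximum of power state 0 from the smartctl output. The helper names are illustrative, and the sketch says nothing about what the M2X board's 3.3 V regulator can actually deliver.

```python
def watts(volts, amps):
    """Electrical power in watts from voltage and current."""
    return volts * amps


def amps_at(power_watts, volts):
    """Current in amps needed to deliver a given power at a given voltage."""
    return power_watts / volts


label_watts = watts(3.3, 2.5)   # rating printed on the SSD label
ps0_amps = amps_at(6.50, 3.3)   # power state 0 maximum, on a 3.3 V rail

print(f"label rating:           {label_watts:.2f} W")  # prints 8.25 W
print(f"PS0 max on 3.3 V rail:  {ps0_amps:.2f} A")     # prints 1.97 A
```

The power supply's 3 A rating applies at its own output voltage, which is not stated in this thread, so it cannot be compared directly with the 2.5 A on the SSD label without converting both to watts.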

You might have to get a different power supply. We use a wall plug-in adapter that is certified for medical devices, so its regulation specification is very strict. We have not had any problems with it when using the adapter board and an NVMe.

Team Khadas might see this thread and jump in and help you out with this. They have tested different NVMe drives and might know the specifics regarding the TiPlus5000.

Does this also happen with the latest 1.5-230425 firmware?

There is no better option for now. I will upgrade the system first and then observe for a while…