cache_dirs - an attempt to keep directory entries in RAM to prevent disk spin-up



I had an odd thing happen yesterday. The permissions on one of my drives were messed up, which prevented the files on that particular drive from showing up in the shares. I kicked off the permission fix tool from the unRAID GUI.

 

It hit the drive with the bad permissions and seemed to be taking an extraordinary amount of time on it. The unRAID GUI was unresponsive when I tried to pull it up in a new window. unMenu displayed, but showed no disk activity on disk20, which was currently being "fixed".

 

I telnetted in and stopped the cache_dirs script, and it reported that it had stopped successfully. However, top still showed it running. I was unable to kill its PID, the PID of the task running the "find" command, or the chmod command that unRAID was running.

 

I commented out the script in the go file and ungracefully powered down unRAID. Upon reboot, the permission script ran without issue. I re-enabled the cache_dirs script, reran the permission fix script, and it completed without issue.

Link to comment


 

When you killed it, did you specify any switches?

 

kill -9 pid

 

will usually kill any stubborn, hung-up processes.

Link to comment


 

I did. kill -9 did not kill it.

If a process is in a system call waiting on a kernel-level interrupt, no signal can kill it (not even kill -9).  Typically, this is due to some kind of driver bug and a deadlock situation in the kernel.  All you can do at that point is what you did... reboot.
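A quick way to confirm that state from the console (standard ps options; substitute whatever PID top reports for the stuck process): a task blocked in an uninterruptible kernel wait shows up with STAT "D", and no signal, including -9, will take effect until that wait returns.

ps -o pid,stat,wchan,comm -p <PID>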
Link to comment
  • 2 weeks later...

Here's an interesting one. I have noticed this before on 4.7.

 

I have a box on which I have moved a load of files about from disk share to disk share using mc. The idea was/is to merge common files that got split up due to high-water allocation, e.g. /TV/Show/S01 ended up on several disks.

 

However, after a certain (and unknown) point of doing this, cache_dirs never seems to let the disks spin down. It's as if there's never enough memory to fully hold the inode data, and the result is that the disks never spin down again.

 

If I spin the disks down manually, they spin up again. If I kill cache_dirs, the disks will spin down naturally. If I reboot the box, cache_dirs does its job and the disks spin down normally.

 

This isn't ideal, as at a certain point cache_dirs moves from being excellent to a complete hindrance.

 

I will keep this box up for a few hours if you're about and want some data from it. After that I need to spin it down, because it's quite warm here today.

 

Cheers

Link to comment


cache_dirs has parameters to limit the number of levels of directories cached.  It also has parameters to exclude specific directories. 

 

It sounds as if you need to do one (or more) of the following:

  A.  install more RAM

  B.  limit the directory depth cached.

  C.  limit (include or exclude) specific user-shares to limit the directories cached.

  D.  use the min-time and max-time parameters to set the min and max time between "find" commands in cache_dirs (a combined example follows this list).

  E.  modify cache_dirs as it suits YOUR needs.  (It is just a shell script after all)

  F.  stop using cache_dirs.    Apparently, your limited memory and high number of directory nodes, COMBINED with other use of memory on your server, cause the cache_dirs "find" command to take too long, and the directory entries in memory end up being freed and re-used by other processes accessing the disks. 
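To illustrate options B through D combined, something along these lines might do (the -d, -m, -M, and -w flags appear elsewhere in this thread; the -i include flag is assumed from the script's usage notes, so check cache_dirs' own help text if unsure, and substitute your own share names):

cache_dirs -w -d 4 -i TV -i Movies -m 10 -M 60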

 

Joe L.

Link to comment

Easier to comment inline

 

It sounds as if you need to either:

  A.  install more RAM

[NAS] 4GB is a lot already, surely

  B.  limit the directory depth cached.

[NAS] I don't do this currently

  C.  limit (include or exclude) specific user-shares to limit the directories cached.

[NAS] I make heavy use of this. Only TV and movies are included

  D.  use the min-time and max-time parameters to set the min and max time between "find" commands in cache_dirs.

[NAS] Need to ponder this one

  E.  modify cache_dirs as it suits YOUR needs.  (It is just a shell script after all)

[NAS] Have already

  F.  stop using cache_dirs.    Apparently, your limited memory and high number of directory nodes, COMBINED with other use of memory on your server, cause the cache_dirs "find" command to take too long, and the directory entries in memory end up being freed and re-used by other processes accessing the disks. 

 

The last one I don't get. If I do nothing on the server for 12 hours AFTER I stop moving files around, cache_dirs is still trying to get a handle on indexing. There are no add-ons of any sort other than cache_dirs and SNAP. It's as if the kernel is not dropping file caches in favour of inode caches. In terms of cache_dirs, there should be no difference between unRAID after several hours of no access and a clean boot.
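One diagnostic worth trying here (a standard kernel interface, nothing unRAID-specific): the page cache can be dropped on its own, without touching the dentry/inode caches, which makes it easier to see whether the directory entries alone still fit in RAM after a big mc session.

sync
echo 1 > /proc/sys/vm/drop_caches

(Writing 1 drops only the page cache; 2 drops dentries and inodes; 3 drops both.)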

 

Link to comment

This is interesting.

 

root@TOWER:~# /boot/scripts/cache_dirs -q
killing cache_dirs process 2395
root@TOWER:~# /boot/scripts/cache_dirs -q
cache_dirs not currently running
root@TOWER:~# lsof | grep cache_dir
cache_dir 32681   root  cwd       DIR        0,1       0          1 /
cache_dir 32681   root  rtd       DIR        0,1       0          1 /
cache_dir 32681   root  txt       REG        0,1  678832         59 /bin/bash
cache_dir 32681   root  mem       REG        0,1   45518        605 /lib/libnss_files-2.7.so
cache_dir 32681   root  mem       REG        0,1 1575187        598 /lib/libc-2.7.so
cache_dir 32681   root  mem       REG        0,1   13474        406 /lib/libdl-2.7.so
cache_dir 32681   root  mem       REG        0,1   10280        344 /lib/libtermcap.so.2.0.8
cache_dir 32681   root  mem       REG        0,1  131493        582 /lib/ld-2.7.so
cache_dir 32681   root    0r     FIFO        0,6            2711083 pipe
cache_dir 32681   root    1u      CHR        5,1               2969 /dev/console
cache_dir 32681   root    2u      CHR        5,1               2969 /dev/console

Link to comment


Does it show up in a process list?  Remember, there are background processes performing the "find" commands.  It will not completely self-terminate until it finishes the "find" it is currently performing.

 

Type

ps -ef | grep cache_dirs

to see if it is still running.  (The -q command indicated it killed process ID 2395.)  lsof seems to indicate that process ID 32681 is still active.

Link to comment

Joe,

 

Sorry, I had to reboot to ensure that cache_dirs started working properly again and the disks spun down before bed.

 

After 16 hours of little to no activity other than cache_dirs, the disks were still up. I rebooted, and after cache_dirs had completed, the disks spun down. This morning I moved 10GB of data onto unRAID via the network and, as predicted, the disks spun down.

 

It seems that if you move a load of data from disk share to disk share using mc, at some point the system gets into a state where cache_dirs can never complete.

 

One scenario could be that cache_dirs (with my exclusions) has enough memory to complete. However, by moving stuff around with cache pressure set to zero, the inode count may grow larger than the memory available. The result could be that the combination of cache pressure and the increased inode count caused by the moves puts us into an unwinnable state.

 

I don't expect this is a RAM limit, as the math doesn't add up with 4GB of RAM, but I could imagine some other kernel inode limit.

 

I am not convinced by my own theory here, as it assumes there's a kernel issue that someone far more extreme than me would have seen before.

Link to comment


I would not set cache_pressure to zero.  I would, in fact, set it closer to 100 as an experiment to see if it helps you.  Obviously, we cannot control inodes cached vs. data blocks. (It would be nice, but that detailed control is not there.)
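For anyone experimenting with this, the underlying knob is the standard kernel tunable vm.vfs_cache_pressure (kernel default 100); if cache_dirs is already managing it for you, adjust it through the script's option rather than by hand:

sysctl vm.vfs_cache_pressure
sysctl -w vm.vfs_cache_pressure=100

(Equivalently: echo 100 > /proc/sys/vm/vfs_cache_pressure.)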

 

Joe L.

Link to comment
  • 2 months later...

I'm not sure, but since I've switched to v5.0-beta6, cache_dirs seems to scan everything and then stop working. Of course I'm just discovering this, but lately my drives keep spinning up and I'm only a folder deep. When I do a cache_dirs -q it says there is no process running. Kinda odd, since I started it manually a day or so ago and it is in my go script (/boot/scripts/cache_dirs -w), which worked perfectly in 4.7.

 

I haven't done much testing to prove this, but something seems to be a bit off, at least for me. LOL  :o

 

Edit.............................

 

Disk4 just finished scanning and then I spun the drive down. I connected to it via \\tower\disk4\ on my Windows 7 machine, and as soon as I was one folder deep it spun up. I then forced the drive to spin down, clicked around in a few folders, and as soon as I was two folders deep and hit some actual files it spun back up again. I do have "show icons, not thumbnails" selected.

Link to comment
  • 1 month later...


 

Hi NAS

I was wondering if you have found a solution. I seem to have a similar problem to yours. After I used mc, my previously functional cache_dirs command:

cache_dirs -d 2 -m 3 -M 4 -w -F

never stops. I even tried to decrease the depth, but to no avail.

PS: I have 2.5GB of RAM (and 1.5GB seems to be free) and I am using cache_dirs 1.6.5.

I also get this:

Executed find in 838.856452 seconds, weighted avg=841.972656 seconds, now sleeping 4 seconds

which I know is normal, but I thought it was supposed to start after cache_dirs had fully scanned my disks.
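One way to see whether the scan is being answered from RAM or is hitting the disks is to time a manual find against a single disk (the path is just an example; use one of your own disk mounts). If the entries are cached it returns in a second or two with no spin-up; if it takes minutes, the cache is being evicted between passes.

time find /mnt/disk1 >/dev/null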

Link to comment

I'm running cache_dirs -w from my go script on boot.  I haven't had any trouble with it, but today I got the following error in my syslog while preclearing my first 3TB drive.  I'm still running 5.0RC2 for now.

 

Nov  6 04:37:23 Hyperion kernel: Pid: 11795, comm: cache_dirs Not tainted 3.0.30-unRAID #4 (Errors)
Nov  6 04:37:23 Hyperion kernel: Call Trace: (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c1028a84>] warn_slowpath_common+0x65/0x7a (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c101636e>] ? default_send_IPI_mask_logical+0x2f/0xb9 (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c1028afd>] warn_slowpath_fmt+0x26/0x2a (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c101636e>] default_send_IPI_mask_logical+0x2f/0xb9 (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c1014b93>] native_send_call_func_ipi+0x4c/0x4e (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c1049a53>] smp_call_function_many+0x18c/0x1a4 (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c105ee4a>] ? page_alloc_cpu_notify+0x2d/0x2d (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c105ee4a>] ? page_alloc_cpu_notify+0x2d/0x2d (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c1049a85>] smp_call_function+0x1a/0x1e (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c1049a9b>] on_each_cpu+0x12/0x27 (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c105fcbe>] drain_all_pages+0x14/0x16 (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c106031e>] __alloc_pages_nodemask+0x371/0x47f (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c1026fd4>] dup_task_struct+0x46/0x119 (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c10279ef>] copy_process+0x70/0x9f6 (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c1028443>] do_fork+0xce/0x1e6 (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c1032e31>] ? set_current_blocked+0x27/0x38 (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c1032f67>] ? sigprocmask+0x7e/0x89 (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c100819b>] sys_clone+0x1b/0x20 (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c130f59d>] ptregs_clone+0x15/0x38 (Errors)
Nov  6 04:37:23 Hyperion kernel:  [<c130ec65>] ? syscall_call+0x7/0xb (Errors)

 

Any ideas why it crashed or if it's something I should be worried about?

syslog-2012-11-06.txt.zip

Link to comment

basically, you ran out of free memory.

 

Apparently, on the 3TB drives you will need to use the -r, -w, and -b options to limit the memory used by the preclear process.

This is because those parameters were originally sized when 1 TB drives were common.  At that time the preclear script was designed to read and write a cylinder at a time.  With larger disks, their geometry has gotten large enough that it may not leave enough memory for other processes.

 

Try something like this:

preclear_disk.sh -w 65536 -r 65536 -b 200 /dev/sdX
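For a sense of scale (assuming, as discussed further down the thread, that -r/-w set the dd read/write block size in bytes and -b the number of blocks per dd invocation), those values work out to:

65536 * 200 = 13,107,200 bytes (about 12.5 MB) covered per dd invocation,

versus 8,225,280 * 200 = 1,645,056,000 bytes (about 1.5 GB) with the old cylinder-sized defaults.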

 

 

 

Link to comment


Hmm.  That's unexpected.  I have 4GB of RAM, but I suppose I am running quite a few apps/plugins (CrashPlan, Apache, MySQL, SickBeard, SABnzbd, and a few others).  I'll check my free memory when I get home, since I can't access it remotely.  Those plugins combined with the preclear sound like they could be the issue, so I'm sure you're right.  The array's been rock solid for months and this is the first time I've ever had a cache_dirs problem, or any memory-related problem, so I'm not surprised they're related.  I guess I lucked out that it killed cache_dirs instead of something more important like the webGUI.
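For the memory check, a couple of stock commands show the numbers that tend to matter here (assuming a 32-bit unRAID build, where "low" memory rather than total free memory is usually the constraint):

free -l
grep -i low /proc/meminfo

The -l switch adds LowFree/HighFree lines to the usual free output.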

 

I'll wait for this preclear to finish, reboot the server, and then use the above command when I run my final 2 cycles (I usually run 1 initial preclear cycle, then follow it up with 2 more if the drive passes).

 

Thanks for the info.  I'll let you know how it works out.

Link to comment

basically, you ran out of free memory.

 

Apparently, on the 3TB drives you will need to use the -r, -w, and -b options to limit the memory used by the preclear process.

This is because those parameters were originally sized when 1 TB drives were common.  At that time the preclear script was designed to read and write a ==>cylinder<== at a time.

Cylinder?  As a design criterion (such as you allude to), the cylinder has been obsolete for ~20 years, even more so in the last 5-8 years. Just change your default to something that "feels" right. And totally forget about "disk geometry"--all that does for you is create a chunk size that is NOT a multiple of 4K.

 

--UhClem "The times they are a'changing."

 

Link to comment


I fully understand that "cylinders" have not been used for 20 years (probably much longer).

The issue I faced when originally writing the preclear script was selecting an appropriate "block size" when reading from and writing to the disks.  I used the output of the "fdisk" command as my guide.  I figured the disk geometry would probably report a size the disk could handle.

 

"fdisk" presented a line like this (from a sample IDE disk on one of my servers):

Units = cylinders of 16065 * 512 = 8225280 bytes

The preclear script then read, by default, 200 "units" of data at a time with a "dd" command looking something like this (for that disk)

dd if=/dev/sdX bs=8225280 count=200 .........

The amount of memory used for a single read request was then 8225280 * 200 = 1,645,056,000 bytes.

 

Now, with larger 3TB disks and a much larger "Unit", you can easily run out of memory.  The use of Units worked for many years, with disk sizes from 6 Gig upwards.  It is only now, with 3TB drives, that the sizes are outgrowing the available RAM.

 

I agree, there needs to be a limit, but a multiple of 4k makes no practical difference at all when you are asking for 8225280 bytes or more at a time.

 

In the interim, use the "-r" and "-w" options as I previously indicated, and you'll probably not run out of memory.

 

Joe L.

Link to comment

I fully understand that "cylinders" have not been used for 20 years (probably much more)

The issue I was faced when originally writing the preclear script was in selecting an appropriate "block size" when reading and writing to the disks.  I used the output of the "fdisk" command as my guide.  I figured the disk geometry would probably report a size it could handle.

No. (In the chronological context of this forum) Totally forget about (disregard!) DISK GEOMETRY.

"fdisk" presented a line like this (from a sample IDE disk on one of my servers):

Units = cylinders of 16065 * 512 = 8225280 bytes

The preclear script then read, by default, 200 "units" of data at a time with a "dd" command looking something like this (for that disk)

dd if=/dev/sdX bs=8225280 count=200 .........

 

The amount of memory used for a single read request was then 8225280 * 200 = 1,645,056,000 bytes.

No. dd read 8225280 bytes at a time. (the entire dd run consisted of 200 such reads). Each read() (re-)used the same ~8MB (user-space) buffer.

Now, with larger 3TB disks, and a much larger different "Unit" you can easily run out of memory.

I don't have any >2TB drives, so I don't know how fdisk -l reports them, but if you do get a much larger "Unit", you have no grounds for complaint. Instead, consider yourself fortunate that you didn't get bit earlier (for following the folly of disk geometry).

I agree, there needs to be a limit, but a multiple of 4k makes no practical difference at all when you are asking for 8225280 bytes or more at a time.

Practical difference or not, it is just good practice to use the same basic unit (and multiples thereof) as the OS. (it's a corollary of the Law of least surprises :) )
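A minimal sketch of that practice (device name is a placeholder; GNU dd size suffixes assumed): an 8 MiB read size is a clean multiple of 4K and lands close to the old 8,225,280-byte geometry-derived figure.

dd if=/dev/sdX of=/dev/null bs=8M count=200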

 

Link to comment

Practical difference or not, it is just good practice to use the same basic unit (and multiples thereof) as the OS. (it's a corollary of the Law of least surprises :) )

Granted.  I was younger and far more innocent when I wrote the preclear utility.  8)

 

FYI, I first  used "dd" many many years before Linux was created.  It has not changed much over the years...

( It was on version 1.0 of CB-Unix.  (barely out of Bell Labs))  I knew it issued reads to the OS sized at the block size.  Never gave much thought to the "count" and its buffering prior to output.  I'll have to look at the Linux source to see what it does these days.  Regardless of what "dd" is doing, the disk buffer cache will be keeping much of what it had recently accessed, simply because it was most recently accessed. 

 

In the same way, cache_dirs is just using the "find" command, and it will force a lot of the disk buffer cache to be involved if you have a deep/extensive directory hierarchy.  Between them,  you can run out of "low" memory. 

 

Joe L.

Link to comment

FYI, I first  used "dd" many many years before Linux was created.  It has not changed much over the years...

( It was on version 1.0 of CB-Unix.  (barely out of Bell-Labs))

Linux ... feh!! ...

FYI, I first used, and did kernel development on, Unix before it even had dd (v4, 1973).

 

The important point about dd, as it pertains to this discussion, is that the bs= and count= options have not changed. And, the (general) lesson to be learned by all (self included) is that it is important to read (and comprehend) the man pages for commands we hope to use correctly and constructively.

I knew it issued reads to the OS sized at the block size.  Never gave much thought to the "count" and its buffering prior to output. I'll have to look at linux source to see what it does these days. 

The description of the count= option has not changed in its entire 38+ year lifetime (except in a negligibly semantic sense):

May 1974: copy only n input records

August 2012: copy only N input blocks

The fact that it is a copy precludes any concern about buffering.

Regardless of what "dd" is doing, the disk buffer cache will be keeping much of what it had recently accessed, simply because it was most recently accessed.

It isn't really keeping it (in an active sense); it has just not yet overwritten it with anything else. Regardless, this cannot be the source (nor excuse/explanation) for any shortage of user-space memory. [Yes, a perverse, and privileged, user can "manufacture" a problem by setting excessively aggressive memory tuning parameters. If so ... "you make your bed, you have to lie in it."]

In the same way, cache_dirs is just using the "find" command, and it will force a lot of the disk buffer cache to be involved if you have a deep/extensive directory hierarchy.  Between them,  you can run out of "low" memory.

No, I don't believe it really is "in the same way". In the dd example, only the buffer cache is in play, and in a very simple/straightforward manner. In the case of your find/cache_dirs example, there is likely some "race condition" provoked by an interplay of the buffer cache, the inode cache, and the directory-entry cache. If you can really cause an error condition this way, then it is a system bug (technically). [but nobody is both willing and able to fix it. (You know, like the national budget problem :( )]

 

--UhClem  "(readme) (doctor memory)"

 

Link to comment

Not to derail the lively discussion here, but for anyone stumbling on this I just wanted to mention that the suggested parameters for preclearing did in fact solve the issue.  I'm on my second preclear cycle without a complaint from the server.  It does appear to be taking quite a bit longer, though.

Link to comment

FYI, I first  used "dd" many many years before Linux was created.  It has not changed much over the years...

( It was on version 1.0 of CB-Unix.  (barely out of Bell-Labs))

Linux ... feh!! ...

FYI, I first used, and did kernel development on, Unix before it even had dd (v4, 1973).

Were you one of those who used "adb -w" on the kernel while loaded in memory?    I fully respect anyone with that skill.  8)

 

The important point about dd, as it pertains to this discussion, is that the bs= and count= options have not changed. And, the (general) lesson to be learned by all (self included) is that it is important to read (and comprehend) the man pages for commands we hope to use correctly and constructively.

I knew it issued reads to the OS sized at the block size.  Never gave much thought to the "count" and its buffering prior to output. I'll have to look at linux source to see what it does these days. 

The description of the count= option has not changed in its entire 38+ year lifetime (except in a negligibly semantic sense):

May 1974: copy only n input records

August 2012: copy only N input blocks

The fact that it is a copy precludes any concern about buffering.

Regardless of what "dd" is doing,  the disk buffer cache will be keeping much of what it had recently accessed simple because it was most recently accessed.

It isn't really keeping it (in an active sense); it has just not yet overwritten it with anything else.

True. I understand the concepts involved.  My early involvement with computers was roughly around the same time as yours, but I was fixing them, at the hardware level, and running hand-coded machine code routines to test them.  There was no "motherboard" back then on the TSPS system I was working on...  It was all DTL logic.  (ICs were not yet in common use.)  My involvement with UNIX did not begin until 1979/1980.  It was a version of PWB-Unix... prior to the Bourne shell. (Its "Mashey" shell actually had labels and "goto".) 

 

Years later (late 80s) I was also involved in writing custom kernel-level "device driver" code in SVR3 UNIX (on a 3B2) for a very special interface to hardware manufactured by a supplier of "smart-phones", back when I was working on a project for AT&T.    Their customer at that time had specified the hardware, I needed to communicate with it, and it was not possible through any existing interface.  So I wrote the device driver. 

 

It was years later that I first ran Linux, and re-wrote the scsi/sound-card driver on it to work on my hardware. (So I could play "doom"  ;D)

  No, I don't believe it really is "in the same way". In the dd example, only the buffer cache is in play, and in a very simple/straightforward manner. In the case of your find/cache_dirs example, there is likely some "race condition" provoked by an interplay of the buffer cache, the inode cache, and the directory-entry cache. If you can really cause an error condition this way, then it is a system bug (technically). [but nobody is both willing and able to fix it. (You know, like the national budget problem :( )]

 

--UhClem  "(readme) (doctor memory)"

I'll agree with you there (on both... the politicians :( and the kernel developers). I have a feeling, like you, that one of the cache systems is not configured to handle a large traversal of files using "find" while at the same time performing a "dd" of zeros to an entire disk.    Since the conditions needed to experience the issue are rare, nobody on the kernel-dev team has fixed it (if they even know of it).  The issue seems to be related to "low memory" exhaustion (a concept we never had to worry about on true UNIX, with a single linear address space and swap space available if memory was insufficient).  It seems not to occur if smaller block sizes are used with the "dd" command.    I never encounter "low memory" issues, since all I have on both of my servers is "low" memory (512 MB on one, 4GB on the other).

 

As you said, the "dd" operation does not involve anything but the disk-buffer cache, but running it concurrently with a "find" of a large hierarchy certainly does involve the dentry and inode cache systems.  I'm sure the user-share file-system is complicating the issue, it being entirely in memory.
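For anyone who wants to watch that interplay directly, the caches involved are visible through standard /proc interfaces (slabtop may or may not be present on a given unRAID build):

grep -E 'dentry|inode_cache' /proc/slabinfo
slabtop -o

The first shows the dentry and per-filesystem inode slab counts; the second gives a one-shot, sorted snapshot of all slab caches.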

 

In an ideal world, we would not need to deal with any of this.  Linux has its quirks when allocating memory for cache and processes, even more in SMP environments.  I suspect it only shows on some hardware/drivers.  As Linux/unRAID changes, we'll just have to adapt to the environment.  5 years ago I never had to think about concurrently writing using large buffer sizes to multiple 3TB disks.  ;) 

 

I guess I will eventually just change the preclear script to use a fixed buffer size, and hope it will work in all situations.

 

Joe L.

PS.

Great to see another old-time-Unix-geek on here.

Link to comment

[To you "youngsters": Please allow us two old farts to reminisce; maybe it'll be of some interest to a few of you.]

 

Before I respond, allow me to put this in a chronological perspective (using a human lifecycle): I'd say that today, Unix is solidly in middle age; it's healthy, accomplished, and has a long future ahead. The PWB/Unix that you referred to was first released in 1976, which I would characterize as Unix's early puberty; the time period you went hands-on, 1979-80, is getting close to its adolescence. My first hands-on with Unix and the kernel was in November 1974; that was the first release of the code from Western Electric (aka AT&T). Seven sets of tapes were mailed from Murray Hill; one came to me (on behalf of my employer). I would call that Unix's early weaning :).

Linux ... feh!! ...

FYI, I first used, and did kernel development on, Unix before it even had dd (v4, 1973).

Were you one of those who used "adb -w" on the kernel while loaded in memory?    I fully respect anyone with that skill.  8)

adb? That's like solid food [cf: weaning]; there was no adb. There was only db, and truthfully, it is a misnomer to call db a debugger. Its limited functionality was to examine a (binary) executable (pre-execution), or to analyze a core dump (post-execution). It could do nothing to help you in debugging a running program, or (obviously) a live kernel. [I will always remember my phone call to Ken Thompson, asking him where the (real) debugger was. When he told me there was none, I was honestly incredulous. I instinctively asked him "How did you and Dennis get all of this stuff working?" His answer: "We read each other's code."] And that is part of what identifies true software deities.

 

But, as a mere mortal, I still needed to debug my modified kernel. So, I hacked the header of DEC's ODT-11 (octal debugging tool for PDP-11) so that it was compatible with Unix ld (the [link-]loader). Adding this odt11.o to the rest of the kernel object files, ld gave me a kernel that I could debug, albeit painfully (no symbols--constantly referring to a nm listing for a symbol map). I believe I was the first person to actually debug (breakpoints, single-step, etc.) on Unix (kernel or user-space).

 

Not to evade your question :), but by the time adb came around (1976-77), I was winding down my kernel hacking. [I guess I don't get your respect, huh? :) :)]

 

Oh my, time-traveling is tiring ... this geezer has to take a nap.

I'll follow-up the on-topic discussion of this (computer) memory exhaustion issue a little later.

 

--UhClem  "Forward! Into the past... back to the shadows, again..."

 

Link to comment
