#11
Why is this folder so slow?
On 4/26/2020 6:46 PM, Paul wrote:
Yousuf Khan wrote:

I have a folder on one of my SSD drives that takes 8 to 10 hours to back up. It is only about 1.4 GB, but it is allocated 2.4 GB of space altogether, and there are 580,000 files here. That indicates that each file is using a little over half of a cluster on average. File system is NTFS. Meanwhile, this same drive can back up the remainder of the drive in under 2 hours, and the remainder of the drive is 390 GB! Is NTFS this inefficient for small files like this?

Yousuf Khan

Have you tried to "defragment" the drive?

Since the folder is on an SSD, fragmentation shouldn't make any difference.

-- Ken
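For what it's worth, the "a little over half of a cluster" estimate can be checked with a few lines of arithmetic. This is only a rough sketch using the figures quoted in the post; it assumes decimal gigabytes and the default 4 KB NTFS cluster size, neither of which the poster confirmed.

```python
# Figures quoted in the original post. Decimal GB and a 4096-byte
# cluster are assumptions, not something the poster stated.
GB = 1000 ** 3
DATA_BYTES = 1.4 * GB        # reported size of the file contents
ALLOCATED_BYTES = 2.4 * GB   # reported size on disk
FILE_COUNT = 580_000
CLUSTER_BYTES = 4096         # assumed NTFS cluster size


def summarize():
    """Return average per-file figures derived from the quoted totals."""
    avg_file = DATA_BYTES / FILE_COUNT
    avg_allocated = ALLOCATED_BYTES / FILE_COUNT
    return {
        "avg_file_bytes": round(avg_file),
        "avg_allocated_bytes": round(avg_allocated),
        "fill_ratio": round(avg_file / CLUSTER_BYTES, 2),
    }


print(summarize())
```

Under those assumptions each file averages roughly 2.4 KB of data and occupies about one 4 KB cluster, which matches the poster's description.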
#12
Why is this folder so slow?
Yousuf Khan wrote:
[...] VSS is used on all of the backup jobs. None of the others exhibit this behaviour. In fact, I've experienced this issue for nearly a decade now. The problem started on Windows XP, continued on into Windows 7, and continues to plague me in Windows 10. This particular folder has also been migrated around from HDD to SSD, to a 2nd SSD, etc. So it's not a problem that is specific to HDDs or SSDs, or to any particular version of Windows.

I'll tell you what this folder is. It's actually my Thunderbird News folder (exactly what I'm using to ask this question here), which exists under my User folder structure. The problem was discovered when I started doing daily backups of my User folder and discovered that the User folder was taking forever. After investigating it some, I figured out that the problem was this particular substructure under News. Once I excluded the News folder, backups finished 6 times faster! So I moved the backups of the News folder to their own job, and let the rest of the User folder get backed up separately. Before you ask, I only back up the News folder once a week, but it's still a pain in the ass watching it take so long even once a week.

If there are 580,000 files in the News folder, then you've probably configured your Thunderbird News account(s) to use one file for each article instead of one file for each newsgroup. If so, it's probably best to bite the bullet and convert to one file per newsgroup. That probably needs an export and (re-)import and probably will be time-consuming, but at least then you'll solve the actual problem. FYI, my setup - not Thunderbird - has nearly a million articles, but only some 600 files.

Some other background. When this particular backup is happening, it's not the drives that are showing as busy, it's the CPU cores! 4 out of the 8 cores on my FX-8300 are fluctuating between 50% and 100% busy, while the other 4 are not that busy.

My guess is that this processing time is spent getting the hundreds of thousands of files into and out of the file system cache.

Yousuf Khan
#13
Why is this folder so slow?
On 27/04/2020 02:24, Yousuf Khan wrote:
I have a folder on one of my SSD drives that takes 8 to 10 hours to back up. It is only about 1.4 GB, but it is allocated 2.4 GB of space altogether, and there are 580,000 files here. That indicates that each file is using a little over half of a cluster on average. File system is NTFS. Meanwhile, this same drive can back up the remainder of the drive in under 2 hours, and the remainder of the drive is 390 GB! Is NTFS this inefficient for small files like this?

NTFS is pretty efficient but some users of Windows 10 machines aren't. Also, your machine must be showing signs of suspicious activities, so the backup program must scan it to see if there is any imminent threat to society in general from your sordid activities.

-- With over 1.2 billion devices now running Windows 10, customer satisfaction is higher than any previous version of windows.
#14
Why is this folder so slow?
Yousuf Khan wrote:
On 4/26/2020 9:32 PM, VanguardLH wrote: Using WHAT backup software? Doing a file-based or image-based backup? Macrium, file-based. Is it a direct access to the folder, or are you using a redirection, like a junction (reparse point)? Does that folder itself have any redirections which could run the backup program into a loop if it doesn't specifically ignore those? No, none of that. Straightforward unredirected. What did you use to check if there were junctions defined within the folder? For example, you could use Nirsoft's NTFSLinksView tool to scan for junctions to list them. You can specify the start folder from where to search, like the folder with the 500K+ files, or search from the root folder of a drive (junctions cannot point to other drives). Alas, if you pick the problematic folder, a scan will only show any junctions in that folder, not those that point at that folder. You might want to scan from the root folder, and then check if that folder is under a junction. Windows has been using junctions for a long time, especially when Microsoft decides to change the name of the special folder, like changing "Documents and Settings", the old name, and "Documents", that both point to C:\Users. Could be your problematic folder is under a junction, like Documents. https://knowledgebase.macrium.com/pa...ageId=23397420 That gives some information. As I recall, Macrium is supposed to ignore symlinks and junctions when creating backups (to prevent looping). That is, it still records the reparse points, but it shouldn't follow them. Make damn sure that Macrium Reflect is *NOT* following reparse points (recording them is okay, but following them during a backup is usually not okay). Go into Reflect under its Other Tasks menu to select Edit Defaults. Under the Backup tab, and under the Reparse Points category, make sure "System - Do not follow" is selected. 
However, the default for User Reparse Points is to follow them, but I've seen users screw them up and generate circular links. See what happens when you set "User - Do not follow". Those are for the default settings used when you /create/ a backup job. For old saved job definitions, they may differ than the current global defaults. Also go into the backup job's definition and set the reparse follow options the same ("Do not follow" for both system and user defined reparse points). You could run a test by moving or copying the problematic folder to elsewhere that is guaranteed not to be under a junction (after first checking the folder itself has no junctions), like copying the folder to C:\problemfolder, and then having Reflect backup just that folder. Are the files in the problematic folder in use? If open for write, another process has to either wait for the file handle to close (get deleted) or times out. Although I also use Macrium Reflect, configuring it to run pre- and post-job commands is *very* clumsy. You have to create a Powershell, VBscript, or batch file and have Macrium run that as its scheduled task. Once you create the script template, you edit it to add your own commands before or after the backup job. The problem that I've run into is that Reflect will have the script run the backup job by calling Reflect as a service which has admin privileges, but doesn't load the command shell itself with admin privs in which the script runs, so commands you enter there that require admin privs won't run. There might be a way around that, but I gave up on Reflect's clumsy pre- and post-command workaround feature, plus you have to maintain the script instead of having an easily configurable command line to edit in the Reflect GUI when creating or editing a backup job. 
However, if you can get Reflect's script feature to work to emulate a pre- and post-job feature, you might look at running the SysInternals' handle.exe command to see which files might be in-use (have open file handles) before the backup job starts.

Getting locked out from reading a file can be thwarted by using VSS (Volume Shadow Copy Service). I'm pretty sure on image backups that Reflect defaults to using VSS. I don't see an option to not use VSS. However, under Other Tasks menu, Advanced tab, check if Reflect will "Automatically retry without VSS writers on failure". If there is a problem with VSS, Reflect will try to back up without VSS.

Also check that the VSS service will change into Running status. Go into Windows services (services.msc), scroll down to the "Volume Shadow Copy" service. It should be set to Manual startup mode, and not Disabled. It runs when called. It does not stay running during the entire time that Windows is running. It is only needed when a shadow copy is needed to get at in-use or system-restricted files, and you are not backing up the entire time you have Windows loaded. If you go into Event Viewer, Application logs, and filter on event ID 8224, you'll see informational events for "The VSS service is shutting down due to idle timeout." I forget the idle interval, probably 15 minutes, but once started the VSS service will eventually stop after the last time it got called by a VSS requestor and after a VSS writer has completed its task. Those users that whine that VSS has idle-stopped don't understand this service is not meant to be always running (Automatic mode). It is manually called by a requestor, used for a while, and then it stops because it's not being used anymore. Been that way since Microsoft introduced VSS back in Windows XP to facilitate backing up of in-use and system files.

https://docs.microsoft.com/en-us/win...w-copy-service

Right click on that service and select Start, or select it and click the Start button.
Did it change into Running status (for awhile)? Some programs install their own VSS writers. As I recall, Paragon supplied their own optional VSS writer you could select instead of using the Windows-provided one. Reflect uses the copy-on-write writer already provided by Windows. You can see a list of VSS writers by running in a command shell: vssadmin list writers Sorry, I haven't delved far enough into this to know which system VSS writer that Reflect will employ. Might be the ASR Writer as noted at https://docs.microsoft.com/en-us/win...ox-vss-writers. Not sure even Reflect cares, as it likely just issues some system API call to use VSS. VSS is only usable when NTFS is used as the file system. You didn't mention WHERE is the problematic folder. If it is a folder on an internal drive that uses NTFS, VSS can come into play (if the targeted files are locked). If the folder is on some external storage media, like a USB HDD or flash drive, could be that uses FAT32 or some other file system than NTFS, so VSS can't be used there. If VSS fails when called by Macrium Reflect, the backup job's log should note the error. See: https://knowledgebase.macrium.com/di...oft+VSS+errors |
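As an alternative to a GUI tool for the junction check described above, a short script can walk a folder tree and list anything that looks like a reparse point. This is a minimal sketch, not what Macrium Reflect or NTFSLinksView do internally; the function names are made up for illustration. It relies on `st_file_attributes`, which Python only provides on Windows, and falls back to plain symlink detection elsewhere.

```python
import os
import stat


def is_reparse_point(path):
    """True for a symlink or, on Windows, any NTFS reparse point (e.g. a junction)."""
    st = os.lstat(path)
    # st_file_attributes only exists on Windows; default to 0 elsewhere.
    attrs = getattr(st, "st_file_attributes", 0)
    return bool(attrs & stat.FILE_ATTRIBUTE_REPARSE_POINT) or stat.S_ISLNK(st.st_mode)


def find_reparse_points(root):
    """List reparse points under root without descending into them."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in list(dirnames) + filenames:
            full = os.path.join(dirpath, name)
            if is_reparse_point(full):
                found.append(full)
        # Prune so the walk never follows a junction or symlink (avoids loops).
        dirnames[:] = [
            d for d in dirnames
            if not is_reparse_point(os.path.join(dirpath, d))
        ]
    return found
```

Note this only finds reparse points inside the starting folder. To learn whether the folder itself sits underneath a junction, each parent directory of its path would have to be checked the same way.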
#15
Why is this folder so slow?
Yousuf Khan wrote:
On 4/27/2020 3:57 AM, Paul wrote:

I seem to remember at some time in the past, you offered advice on putting an exception for an AV program, so it does not scan that particular directory (something in Thunderbird). If your CPU cores are railed, I'd be tracing down the PID of the offender. One way to do it on a Pro SKU of OS, is

    tasklist /svc     # should not work on Home

Not even necessary, I can tell you right now which process is responsible, it's the Macrium Reflect binary. Also the System process which I assume the Reflect binary also makes heavy use of during this time.

OK, show me a chunk of nfi.exe output, just for files in the magical folder. Just enough to capture the essence of what's going on.

nfi.exe is in here (13,529,558 bytes)

https://web.archive.org/web/20070104...s/oem3sr2s.zip

Run

    nfi.exe C: c_nfi.txt

This is what a file looks like, followed by a directory. A directory has a $I30 entry in it.

    File 5468
    \YOUTUBE_CAP\out_linux_ffmpeg2.avi
        $STANDARD_INFORMATION (resident)
        $FILE_NAME (resident)
        $DATA (nonresident)
            logical sectors 2576342736-2577800527 (0x998fded0-0x99a61d4f)

    File 5463
    \YOUTUBE_CAP
        $STANDARD_INFORMATION (resident)
        $FILE_NAME (resident)
        $INDEX_ROOT $I30 (resident)
        $INDEX_ALLOCATION $I30 (nonresident)
            logical sectors 2577800616-2577800623 (0x99a61da8-0x99a61daf)
        $BITMAP $I30 (resident)

What we're looking for here, is something like an extended attribute.

You might also use fsutil, and verify the cluster size (4KB default). Windows 10 stopped tolerating non-default cluster sizes on C: about three OSes ago, so it pretty well has to be 4KB now on cluster size.

One reason I want some info about your 800,000 file folder, is I want to see if there are no logical sectors (small files, like 1KB files, fit within $MFT and don't use clusters for the data storage). Or I want to see if the clusters are fragmented.

One other thing Windows 10 does now, is they added a small write cache (per handle). The write cache has "ruined" the notion of fragmentation, in the sense that no fragment can be 4KB. The buffer is 64KB. If a file fragments today in Windows 10, the chunk size should be 64KB. I use the Passmark fragment generator, to create fragmented files for testing. I noticed that if the Passmark fragment generator is run on modern Windows 10, the fragments don't seem to be any smaller than 64KB. If I run under an older OS, you can see on the screen (JKDefrag) that the fragments are finer. I do these tests on a RAMDisk so no harm comes to any physical storage devices.

You might ask "I have a 4KB file to store, what happens with the 64KB buffer in that case". I don't know. Obviously it cannot break, or we'd have heard about it by now. The buffer must flush when the handle closes. I only mention this new feature, in case you examine your 800,000 files and notice there's no fragmentation at all.

Paul
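If the nfi.exe listing for half a million files is too long to eyeball, a small script could tally how many files keep their data inside the $MFT (resident) versus in clusters (nonresident). This is only a sketch: it assumes each entry is laid out as a "File N" line, then the path on its own line, then one attribute per line, which is inferred from the sample above rather than from any nfi documentation. The function name and the `folder_prefix` parameter are made up for illustration.

```python
import re


def count_data_residency(nfi_text, folder_prefix=""):
    """Count resident vs nonresident $DATA attributes for paths under folder_prefix.

    Assumes (not verified against nfi documentation) that each entry is a
    'File N' line, then the path on the next non-blank line, then attributes.
    """
    counts = {"resident": 0, "nonresident": 0}
    current_path = None
    expect_path = False
    for raw in nfi_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if re.match(r"^File \d+$", line):
            expect_path = True
            current_path = None
        elif expect_path:
            current_path = line
            expect_path = False
        elif line.startswith("$DATA") and current_path is not None:
            if current_path.startswith(folder_prefix):
                # Test "nonresident" first, since it contains "resident".
                kind = "nonresident" if "(nonresident)" in line else "resident"
                counts[kind] += 1
    return counts
```

Directories carry no $DATA line in the sample, so they are simply not counted. A file with additional named data streams would be counted once per stream.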
#16
Why is this folder so slow?
Yousuf Khan wrote:
I'll tell you what this folder is. It's actually my Thunderbird News folder (exactly what I'm using to ask this question here), which exists under my User folder structure. The problem was discovered when I started doing daily backups of my User folder and discovered that the User folder was taking forever. After investigating it some, I figured out that the problem was this particular substructure under News. Once I excluded the News folder, backups finished 6 times faster! So I moved the backups of the News folder to their own job, and let the rest of the User folder get backed up separately. Before you ask, I only back up the News folder once a week, but it's still a pain in the ass watching it take so long even once a week.

Some other background. When this particular backup is happening, it's not the drives that are showing as busy, it's the CPU cores! 4 out of the 8 cores on my FX-8300 are fluctuating between 50% and 100% busy, while the other 4 are not that busy.

Yousuf Khan

As a test, disable your anti-virus software and run your TB data-only backup job.

As another test, make sure to *exit* Thunderbird (check there are no instances of TB in Task Manager's Processes tab), and check if the backup job is just as slow. Do you leave TB running all the time? Does the backup job run as a scheduled event at a time after you would've unloaded TB, like you use TB during the day (say 8 AM to 11 PM), unload it when done, and you schedule the backup job to run early morning (say 4 AM)?

VSS will encounter problems with databases that are not VSS aware. Microsoft's SQL Server is VSS aware, but others are not. The recommendation in backup programs, even those using VSS, for database programs that are not VSS aware is to schedule their shutdown before the backup, schedule the backup while the database program is down, and restart the database program after the backup finishes.
While this can be done using Task Scheduler using event triggers (provided the database program issues an event on shutdown), it's a pain to figure out the script-like code you have to use to define for the trigger of the scheduled event. There are schedulers that are more flexible that can make their events dependent: task 3 runs only after task 2 ran and returned good status which runs only after task 1 completed and returned good status. https://knowledgebase.macrium.com/di...ware+databases I sincerely doubt Thunderbird provides its own VSS writer. What does Tbird use to manage its message store? Isn't it SQLite? SQLite is not a VSS-aware database program. In fact, it isn't a database program at all. It's a library from which some program can call its functions (aka methods). It would be up to the calling program to be VSS-aware, and I doubt Mozilla ever added that to Tbird. http://sqlite.1065341.n5.nabble.com/...r-td85887.html I remember back when using MS Outlook with POP which stored its message store in a PST file that backups would often skip that database. While Outlook was running, its database couldn't be backed up because it wasn't only in-use but also locked as a database. MS didn't provide a VSS writer just for Outlook. Some users used batch files that would kill Outlook, run the backup (to include Outlook's message store), and reload Outlook after the backup finished. However, Outlook has no way to gracefully unload it. There is no command-line switch for Outlook to ask it to unload. You had to kill it, and that's always a bad way to smash a program with open files since corruption can occur to the files. Some backup programs worked around the problem by installing an extension into Outlook that would exit it and start the backup program, and the backup program would later restart Outlook. I'm sure there were other workarounds. 
Since Outlook is a client, not a server, there really was no need to leave it running 24x7, but a lot of users ran it that way, so it was available upon their return to their computer.

Not all programs that manage a database are VSS-aware. Usually the easiest solution is to make sure the program using the database is not running at the time of the backup job. Does Tbird have a command-line switch that will unload the currently loaded instance(s) of Tbird? Using taskkill.exe is abrupt and can result in file corruption. If Tbird can be requested to shut down gracefully, you could do that in a script, run the backup job, and reload Tbird (if you can get scripts via Powershell, VBscript, or batch to work in Reflect).

I doubt Tbird generates an event when it exits (i.e., you don't see anything in Event Viewer). If it does, you can define a scheduled event in Task Scheduler to run the backup job that triggers on the exit event of Tbird.
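The "task 3 runs only after task 2 returned good status" idea mentioned above can be sketched without a special scheduler. The following is a hedged illustration only: `run_chain` is a made-up helper, and the commands in the usage example are harmless stand-ins, not real Thunderbird or Macrium Reflect command lines (the thread does not establish what those would be).

```python
import subprocess
import sys


def run_chain(steps):
    """Run (label, argv) steps in order; stop at the first non-zero exit code.

    Returns (labels_that_succeeded, label_that_failed_or_None).
    """
    completed = []
    for label, argv in steps:
        result = subprocess.run(argv)
        if result.returncode != 0:
            return completed, label
        completed.append(label)
    return completed, None


if __name__ == "__main__":
    # Stand-in commands; replace with whatever actually stops the mail
    # client, runs the backup job, and restarts the client on your system.
    ok = [sys.executable, "-c", "raise SystemExit(0)"]
    print(run_chain([("stop client", ok), ("backup", ok), ("restart client", ok)]))
```

One design caveat: with a strict chain like this, a failed backup step also skips the restart step, so in practice the restart might be better run unconditionally afterwards.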
#17
Why is this folder so slow?
On 4/27/2020 2:03 PM, VanguardLH wrote:
What did you use to check if there were junctions defined within the folder? For example, you could use Nirsoft's NTFSLinksView tool to scan for junctions to list them. You can specify the start folder from where to search, like the folder with the 500K+ files, or search from the root folder of a drive (junctions cannot point to other drives). Alas, if you pick the problematic folder, a scan will only show any junctions in that folder, not those that point at that folder. You might want to scan from the root folder, and then check if that folder is under a junction. Windows has been using junctions for a long time, especially when Microsoft decides to change the name of the special folder, like changing "Documents and Settings", the old name, and "Documents", that both point to C:\Users. Could be your problematic folder is under a junction, like Documents. I don't have to look for junctions, I know where they are. If there were junctions here, I would have put them in myself, otherwise they aren't there. Yousuf Khan |
#18
Why is this folder so slow?
On 4/27/2020 12:04 PM, Frank Slootweg wrote:
If there are 580,000 files in the News folder, then you've probably configured your Thunderbird News account(s) to use one file for each article instead of one file for each newsgroup. If so, it's probably best to bite the bullet and convert to one file per newsgroup. That probably needs an export and (re-)import and probably will be time-consuming, but at least then you'll solve the actual problem. FYI, my setup - not Thunderbird - has nearly a million articles, but only some 600 files. Yes, that is exactly the problem, I was getting at. Does Thunderbird have a new news file format available? My assumption was that Thunderbird only does 1 file/message? What's the option to convert? Yousuf Khan |
#19
Why is this folder so slow?
On 4/27/2020 2:29 PM, VanguardLH wrote:
As a test, disable your anti-virus software and run your TB data-only backup job.

Yes, that was done years ago too. This folder has been a major headache for years now. And at one time, I found that the AV software was spending tons of time scanning this folder too, so I put in an exclusion for this folder. The AV doesn't ever scan in this folder anymore.

As another test, make sure to *exit* Thunderbird (check there are no instances of TB in Task Manager's Processes tab), and check if the backup job is just as slow.

Yeah, but it doesn't matter, Thunderbird's email folders don't suffer from this problem. If Thunderbird running in the background, or its VSS awareness, were the cause, then this problem would be happening during backups of the email store as well, but it's only happening in the newsgroup store. The email store is much, much more active than the newsgroup store, but emails aren't affected, just newsgroups.

VSS will encounter problems with databases that are not VSS aware. Microsoft's SQL Server is VSS aware, but others are not. The recommendation in backup programs, even those using VSS, for database programs that are not VSS aware is to schedule their shutdown before the backup, schedule the backup while the database program is down, and restart the database program after the backup finishes. While this can be done using Task Scheduler using event triggers (provided the database program issues an event on shutdown), it's a pain to figure out the script-like code you have to use to define for the trigger of the scheduled event. There are schedulers that are more flexible that can make their events dependent: task 3 runs only after task 2 ran and returned good status which runs only after task 1 completed and returned good status.

Thunderbird never downloads newsgroup messages in the background, like it does with email, it only downloads them when you explicitly open the newsgroups account.
This is also related to what I said above about how much busier the Thunderbird email store is compared to the newsgroup store. Thunderbird may be doing things in the background but only with email.

It's not related to VSS, I've already given you the most likely cause of the problem: there are over half a million files, and each file is inefficiently taking up a little over half of an NTFS cluster, rather than spreading a lesser number of files over many clusters. The real question is how can we make NTFS more efficient at handling all of these little files? NTFS is great at handling big files, but tiny little files not so much.

Yousuf Khan
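To put numbers on the "many tiny files" theory for any given folder, something like the following could be run against it. It is a rough sketch with a made-up function name: it counts files, sums their sizes, and estimates the allocated space by rounding each file up to a whole cluster. The 4096-byte cluster size is an assumption, and the estimate ignores the case raised earlier in the thread where very small files are stored inside the $MFT and use no clusters at all.

```python
import os


def folder_stats(root, cluster_bytes=4096):
    """Return (file_count, data_bytes, cluster_rounded_bytes) for a folder tree.

    cluster_rounded_bytes is only an estimate: each file's size is rounded up
    to a whole number of clusters, ignoring MFT-resident small files.
    """
    file_count = 0
    data_bytes = 0
    rounded_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            size = os.lstat(os.path.join(dirpath, name)).st_size
            file_count += 1
            data_bytes += size
            # Ceiling division: clusters needed to hold `size` bytes.
            rounded_bytes += -(-size // cluster_bytes) * cluster_bytes
    return file_count, data_bytes, rounded_bytes
```

Comparing the file count and the ratio of the last two numbers between the News folder and the email store would show whether the two really differ in the way described.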
#20
Why is this folder so slow?
On 2020-04-26 18:24, Yousuf Khan wrote:
I have a folder on one of my SSD drives that takes 8 to 10 hours to back up. It is only about 1.4 GB, but it is allocated 2.4 GB of space altogether, and there are 580,000 files here. That indicates that each file is using a little over half of a cluster on average. File system is NTFS. Meanwhile, this same drive can back up the remainder of the drive in under 2 hours, and the remainder of the drive is 390 GB! Is NTFS this inefficient for small files like this?

Yousuf Khan

Hi Yousuf,

When I see things like this, it is usually a failing drive, especially when the index on the offending directory never finishes. This will show up like a sore thumb if you run your drive through gsmartcontrol: check the error logs and run the self tests.

http://gsmartcontrol.sourceforge.net....php/Downloads

Get back to us!

-T