A computer components & hardware forum. HardwareBanter


Why is this folder so slow?



 
 
  #11  
April 27th 20, 03:30 PM, posted to alt.comp.os.windows-10,comp.sys.ibm.pc.hardware.storage
Ken Blake

On 4/26/2020 6:46 PM, Paul wrote:
Yousuf Khan wrote:
I have a folder on one of my SSD drives that takes 8 to 10 hours to back
up. It is only about 1.4 GB, but it is allocated 2.4 GB of space
altogether, and there are 580,000 files here. Indicates that per file
it's using up a little bit over half of a cluster on average. File
system is NTFS.

Meanwhile, this same drive can backup the remainder of the drive in
under 2 hours, and the remainder of the drive is 390 GB! Is NTFS this
inefficient for small files like this?

Yousuf Khan


Have you tried to "defragment" the drive ?



Since the folder is on an SSD, fragmentation shouldn't make any difference.

--
Ken
  #12  
April 27th 20, 05:04 PM, posted to alt.comp.os.windows-10,comp.sys.ibm.pc.hardware.storage
Frank Slootweg

Yousuf Khan wrote:
[...]

VSS is used on all of the backup jobs. None of the others exhibit this
behaviour. In fact, I've experienced this issue for nearly a decade now.
The problem started on Windows XP, continued on into Windows 7, and
continues to plague me in Windows 10. This particular folder has also
been migrated around from HDD to SSD, to a 2nd SSD, etc. So it's not a
problem that is specific to HDD's or SSD's, or to any particular version
of Windows.

I'll tell you what this folder is. It's actually my Thunderbird News
folder (exactly what I'm using to ask this question here), which exists
under my User folder structure. The problem was discovered when I
started doing daily backups of my User folder and discovered that the
User folder was taking forever. After investigating it some, I figured
out that the problem was this particular substructure under News. Once I
excluded the News folder, backups finished 6 times faster! So I moved
the backups of the News folder to their own job, and let the rest of the
User folder get backed up separately. Before you ask, I only back up the
News folder once a week, but it's still a pain in the ass watching it
take so long even once a week.


If there are 580,000 files in the News folder, then you've probably
configured your Thunderbird News account(s) to use one file for each
article instead of one file for each newsgroup.

If so, it's probably best to bite the bullet and convert to one file
per newsgroup. That probably needs an export and (re-)import and
probably will be time-consuming, but at least then you'll solve the
actual problem.

FYI, my setup - not Thunderbird - has nearly a million articles, but
only some 600 files.

Some other background. When this particular backup is happening, it's
not the drives that are showing as busy, it's the CPU cores! 4 out of
the 8 cores on my FX-8300 are fluctuating between 50% to 100% busy,
while the other 4 are not that busy.


My guess is that this processing is spent getting the hundreds of
thousands of files into and out of the file system cache.

Yousuf Khan

  #13  
April 27th 20, 05:18 PM, posted to alt.comp.os.windows-10,comp.sys.ibm.pc.hardware.storage
😉 Good Guy 😉

On 27/04/2020 02:24, Yousuf Khan wrote:
I have a folder on one of my SSD drives that takes 8 to 10 hours to
back up. It is only about 1.4 GB, but it is allocated 2.4 GB of space
altogether, and there are 580,000 files here. Indicates that per file
it's using up a little bit over half of a cluster on average. File
system is NTFS.

Meanwhile, this same drive can backup the remainder of the drive in
under 2 hours, and the remainder of the drive is 390 GB! Is NTFS this
inefficient for small files like this?



NTFS is pretty efficient, but some users of Windows 10 machines aren't.
Also, your machine must be showing signs of suspicious activity, so the
backup program must scan it to see if there are any imminent threats to
society in general from your sordid activities.



--
With over 1.2 billion devices now running Windows 10, customer
satisfaction is higher than any previous version of windows.

  #14  
April 27th 20, 07:03 PM, posted to alt.comp.os.windows-10,comp.sys.ibm.pc.hardware.storage
VanguardLH

Yousuf Khan wrote:

On 4/26/2020 9:32 PM, VanguardLH wrote:
Using WHAT backup software? Doing a file-based or image-based backup?


Macrium, file-based.

Is it a direct access to the folder, or are you using a redirection,
like a junction (reparse point)? Does that folder itself have any
redirections which could run the backup program into a loop if it
doesn't specifically ignore those?


No, none of that. Straightforward unredirected.


What did you use to check if there were junctions defined within the
folder? For example, you could use Nirsoft's NTFSLinksView tool to scan
for junctions to list them. You can specify the start folder from where
to search, like the folder with the 500K+ files, or search from the root
folder of a drive (junctions can point to other local volumes, though
not to network shares). Alas, if you pick the problematic folder, a scan
will only show any junctions in that folder, not those that point at
that folder. You might want to scan from the root folder, and then
check if that folder is under a junction. Windows has been using
junctions for a long time, especially where Microsoft changed the name
of a special folder: "Documents and Settings", the old name, is now a
junction that points to C:\Users, and "My Documents" inside a profile
is a junction to Documents. Could be your problematic folder is under
a junction like that.
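
If you would rather not rely on a third-party tool, the same check can be roughed out in a few lines of Python. This is only a sketch under some assumptions: `find_reparse_points` and `ancestors_that_are_reparse_points` are made-up helper names, and the NTFS reparse flag is read from `st_file_attributes`, which Python only fills in on Windows (elsewhere the sketch falls back to reporting plain symlinks).

```python
import os
import stat


def is_reparse_point(path):
    """Return True if path is a symlink or (on Windows) any NTFS reparse point."""
    st = os.lstat(path)  # lstat looks at the link itself, not its target
    attrs = getattr(st, "st_file_attributes", 0)  # only populated on Windows
    reparse_flag = getattr(stat, "FILE_ATTRIBUTE_REPARSE_POINT", 0x400)
    return bool(attrs & reparse_flag) or stat.S_ISLNK(st.st_mode)


def find_reparse_points(root):
    """Yield every reparse point below root, without descending into them."""
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in list(dirnames) + filenames:
            full = os.path.join(dirpath, name)
            if is_reparse_point(full):
                yield full
                if name in dirnames:
                    dirnames.remove(name)  # do not walk into the junction


def ancestors_that_are_reparse_points(path):
    """Return the parent folders of path that are themselves reparse points."""
    found = []
    current = os.path.abspath(path)
    while True:
        parent = os.path.dirname(current)
        if parent == current:  # reached the drive or filesystem root
            break
        if is_reparse_point(parent):
            found.append(parent)
        current = parent
    return found
```

The first function answers "are there junctions inside the folder", the second answers "is the folder itself sitting under one", which are the two separate questions raised above.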

https://knowledgebase.macrium.com/pa...ageId=23397420

That gives some information. As I recall, Macrium is supposed to ignore
symlinks and junctions when creating backups (to prevent looping). That
is, it still records the reparse points, but it shouldn't follow them.

Make damn sure that Macrium Reflect is *NOT* following reparse points
(recording them is okay, but following them during a backup is usually
not okay). Go into Reflect under its Other Tasks menu to select Edit
Defaults. Under the Backup tab, and under the Reparse Points category,
make sure "System - Do not follow" is selected. However, the default
for User Reparse Points is to follow them, but I've seen users screw
them up and generate circular links. See what happens when you set
"User - Do not follow". Those are for the default settings used when
you /create/ a backup job. Old saved job definitions may
differ from the current global defaults. Also go into the backup job's
definition and set the reparse follow options the same ("Do not follow"
for both system and user defined reparse points).

You could run a test by moving or copying the problematic folder to
elsewhere that is guaranteed not to be under a junction (after first
checking the folder itself has no junctions), like copying the folder to
C:\problemfolder, and then having Reflect backup just that folder.
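
Before involving Reflect at all, it might also be worth timing a plain read of every file in the copied folder, to see whether the per-file overhead shows up outside the backup program. A minimal Python sketch, with `time_tree_read` being a hypothetical helper and C:\problemfolder just the example path from the paragraph above:

```python
import os
import time


def time_tree_read(root, chunk_size=1024 * 1024):
    """Read every file under root once; return (file_count, total_bytes, seconds)."""
    file_count = 0
    total_bytes = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    while True:
                        chunk = fh.read(chunk_size)
                        if not chunk:
                            break
                        total_bytes += len(chunk)
            except OSError:
                continue  # locked or unreadable file: skip it, do not count it
            file_count += 1
    return file_count, total_bytes, time.perf_counter() - start


if __name__ == "__main__":
    # Example path taken from the suggestion above; adjust as needed.
    count, size, secs = time_tree_read(r"C:\problemfolder")
    rate = count / secs if secs > 0 else float("inf")
    print(f"{count} files, {size} bytes, {secs:.1f} s, {rate:.0f} files/s")
```

Keep in mind that a second run will mostly be served from the file system cache, so only the first run after a reboot says much about the disk.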

Are the files in the problematic folder in use? If open for write,
another process has to either wait for the file handle to close (get
deleted) or time out. Although I also use Macrium Reflect, configuring
it to run pre- and post-job commands is *very* clumsy. You have to
create a Powershell, VBscript, or batch file and have Macrium run that
as its scheduled task. Once you create the script template, you edit it
to add your own commands before or after the backup job. The problem
that I've run into is that Reflect will have the script run the backup
job by calling Reflect as a service which has admin privileges, but
doesn't load the command shell itself with admin privs in which the
script runs, so commands you enter there that require admin privs won't
run. There might be a way around that, but I gave up on Reflect's
clumsy pre- and post-command workaround feature, plus you have to
maintain the script instead of having an easily configurable command
line to edit in the Reflect GUI when creating or editing a backup job.
However, if you can get Reflect's script feature to work to emulate a
pre- and post-job feature, you might look at running the SysInternals'
handle.exe command to see which files might be in-use (have open file
handles) before the backup job starts.

Getting locked out from reading a file can be thwarted by using VSS
(Volume Shadow Copy Service). I'm pretty sure that for image backups Reflect
defaults to using VSS. I don't see an option to not use VSS. However,
under Other Tasks menu, Advanced tab, check if Reflect will
"Automatically retry without VSS writers on failure". If there is a
problem with VSS, Reflect will try to backup without VSS.

Also check that the VSS service can change to Running status. Go into
Windows services (services.msc), scroll down to "Volume Shadow Copy"
service. It should be set to Manual startup mode, and not Disabled. It
runs when called. It does not stay running during the entire time that
Windows is running. It is only needed when a shadow copy is needed to
get at in-use or system-restricted files, and you are not backing up the
entire time you have Windows loaded. If you go into Event Viewer,
Application logs, and filter on event ID 8224, you'll see informational
events for "The VSS service is shutting down due to idle timeout."

I forget the idle interval, probably 15 minutes, but once started the
VSS service will eventually stop after the last time it got called by a
VSS requestor and after a VSS writer has completed its task. Those
users who whine that VSS has idle-stopped don't understand that this service
is not meant to be always running (Automatic mode). It is manually
called by a requestor, used for a while, and then it stops because it's
not being used anymore. Been that way since Microsoft introduced VSS
back in Windows XP to facilitate backing up of in-use and system files.

https://docs.microsoft.com/en-us/win...w-copy-service

Right click on that service and select Start, or select it and click the
Start button. Did it change into Running status (for a while)?

Some programs install their own VSS writers. As I recall, Paragon
supplied their own optional VSS writer you could select instead of using
the Windows-provided one. Reflect uses the copy-on-write writer already
provided by Windows. You can see a list of VSS writers by running in a
command shell:

vssadmin list writers

Sorry, I haven't delved far enough into this to know which system VSS
writer that Reflect will employ. Might be the ASR Writer as noted at
https://docs.microsoft.com/en-us/win...ox-vss-writers.
Not sure even Reflect cares, as it likely just issues some system API
call to use VSS.

VSS is only usable when NTFS is used as the file system. You didn't
mention WHERE the problematic folder is. If it is a folder on an
internal drive that uses NTFS, VSS can come into play (if the targeted
files are locked). If the folder is on some external storage media,
like a USB HDD or flash drive, it could be using FAT32 or some file
system other than NTFS, so VSS can't be used there.

If VSS fails when called by Macrium Reflect, the backup job's log should
note the error. See:

https://knowledgebase.macrium.com/di...oft+VSS+errors
  #15  
April 27th 20, 07:09 PM, posted to alt.comp.os.windows-10,comp.sys.ibm.pc.hardware.storage
Paul

Yousuf Khan wrote:
On 4/27/2020 3:57 AM, Paul wrote:
I seem to remember at some time in the past, you offered
advice on putting an exception for an AV program,
so it does not scan that particular directory
(something in Thunderbird).

If your CPU cores are railed, I'd be tracing down the
PID of the offender.

One way to do it on a Pro SKU of OS, is

tasklist /svc # should not work on Home


Not even necessary, I can tell you right now which process is
responsible, it's the Macrium Reflect binary. Also the System process
which I assume the Reflect binary also makes heavy use of during this time.


OK, show me a chunk of nfi.exe output, just
for files in the magical folder. Just enough
to capture the essence of what's going on.

nfi.exe is in here (13,529,558 bytes)

https://web.archive.org/web/20070104...s/oem3sr2s.zip

Run

nfi.exe C: > c_nfi.txt

This is what a file looks like, followed by a directory.
A directory has a $I30 entry in it.

File 5468

\YOUTUBE_CAP\out_linux_ffmpeg2.avi
$STANDARD_INFORMATION (resident)
$FILE_NAME (resident)
$DATA (nonresident)
logical sectors 2576342736-2577800527 (0x998fded0-0x99a61d4f)

File 5463

\YOUTUBE_CAP
$STANDARD_INFORMATION (resident)
$FILE_NAME (resident)
$INDEX_ROOT $I30 (resident)
$INDEX_ALLOCATION $I30 (nonresident)
logical sectors 2577800616-2577800623 (0x99a61da8-0x99a61daf)
$BITMAP $I30 (resident)

What we're looking for here, is something
like an extended attribute.

You might also use fsutil, and verify the cluster
size (4KB default). Windows 10 stopped tolerating
non-default cluster sizes on C: about three OSes ago,
so it pretty well has to be 4KB now on cluster size.

One reason I want some info about your 580,000 file folder,
is I want to see if there are no logical sectors
(small files, like 1KB files, fit within $MFT and don't
use clusters for the data storage). Or I want to see if
the clusters are fragmented.
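
A quick size survey can give a first hint before digging through nfi output. The Python sketch below is only an estimate: `summarize_sizes` is a made-up name, the 4096-byte cluster is the default mentioned above, and the 700-byte "may be resident" cutoff is a rough guess, since the real limit depends on how much of each $MFT record the other attributes use. It says nothing about fragmentation.

```python
import os


def summarize_sizes(root, cluster_size=4096, resident_limit=700):
    """Summarize file sizes under root.

    cluster_size: assumed NTFS cluster size in bytes (verify with fsutil).
    resident_limit: assumed rough cutoff below which NTFS may keep the data
    inside the $MFT record; the real cutoff varies per file.
    """
    summary = {"files": 0, "bytes": 0, "maybe_resident": 0, "est_allocated": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                size = os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is inaccessible
            summary["files"] += 1
            summary["bytes"] += size
            if size <= resident_limit:
                summary["maybe_resident"] += 1  # would use no clusters at all
            else:
                clusters = -(-size // cluster_size)  # ceiling division
                summary["est_allocated"] += clusters * cluster_size
    return summary
```

If `maybe_resident` is close to zero and `est_allocated` lands near the "size on disk" that Explorer reports, then nearly every article is occupying at least one full cluster.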

One other thing Windows 10 does now, is they added a small
write cache (per handle). The write cache has "ruined" the
notion of fragmentation, in the sense that no fragment
can be 4KB. The buffer is 64KB. If a file fragments today in
Windows 10, the chunk size should be 64KB.

I use the Passmark fragment generator, to create fragmented
files for test. I noticed that if the Passmark fragment
generator is run on modern Windows 10, the fragments
don't seem to be any smaller than 64KB. If I run under an
older OS, you can see on the screen (JKDefrag) that the
fragments are finer. I do these tests on a RAMDisk so
no harm comes to any physical storage devices.

You might ask "I have a 4KB file to store, what happens
with the 64KB buffer in that case". I don't know. Obviously it
cannot break, or we'd have heard about it by now. The buffer must
flush when the handle closes.

I only mention this new feature, in case you examine your
580,000 files and notice there's no fragmentation at all.

Paul
  #16  
April 27th 20, 07:29 PM, posted to alt.comp.os.windows-10,comp.sys.ibm.pc.hardware.storage
VanguardLH

Yousuf Khan wrote:

I'll tell you what this folder is. It's actually my Thunderbird News
folder (exactly what I'm using to ask this question here), which exists
under my User folder structure. The problem was discovered when I
started doing daily backups of my User folder and discovered that the
User folder was taking forever. After investigating it some, I figured
out that the problem was this particular substructure under News. Once I
excluded the News folder, backups finished 6 times faster! So I moved
the backups of the News folder to their own job, and let the rest of the
User folder get backed up separately. Before you ask, I only back up the
News folder once a week, but it's still a pain in the ass watching it
take so long even once a week.

Some other background. When this particular backup is happening, it's
not the drives that are showing as busy, it's the CPU cores! 4 out of
the 8 cores on my FX-8300 are fluctuating between 50% to 100% busy,
while the other 4 are not that busy.

Yousuf Khan


As a test, disable your anti-virus software and run your TB data-only
backup job.

As another test, make sure to *exit* Thunderbird (check there are no
instances of TB in Task Manager's Processes tab), and check if the
backup job is just as slow.

Do you leave TB running all the time? Does the backup job run as a
scheduled event at a time after you would've unloaded TB, like you use
TB during the day (say 8 AM to 11 PM), unload it when done, and you
schedule the backup job to run early morning (say 4 AM)?

VSS will encounter problems with databases that are not VSS aware.
Microsoft's SQL Server is VSS aware, but others are not. The
recommendation in backup programs, even those using VSS, for database
programs that are not VSS aware is to schedule their shutdown before the
backup, schedule the backup while the database program is down, and
restart the database program after the backup finishes. While this can
be done using Task Scheduler using event triggers (provided the database
program issues an event on shutdown), it's a pain to figure out the
script-like code you have to use to define for the trigger of the
scheduled event. There are schedulers that are more flexible that can
make their events dependent: task 3 runs only after task 2 ran and
returned good status which runs only after task 1 completed and returned
good status.
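
The dependent-task idea itself is simple enough to sketch in Python: run each step in order and stop at the first one that returns a bad status. Everything here is illustrative; `run_chain` is a made-up helper, and the three placeholder commands only print a message so the sketch runs anywhere. The real stop, backup, and restart commands would have to be substituted.

```python
import subprocess
import sys


def run_chain(steps):
    """Run (label, argv) steps in order; stop at the first non-zero exit code.

    Returns the list of labels that completed successfully.
    """
    completed = []
    for label, argv in steps:
        result = subprocess.run(argv)
        if result.returncode != 0:
            print(f"{label} failed with exit code {result.returncode}; stopping")
            break
        completed.append(label)
    return completed


# Placeholder commands: each just runs a tiny Python one-liner.
# Swap in the real stop / backup / restart commands.
example_steps = [
    ("stop database", [sys.executable, "-c", "print('stopping')"]),
    ("run backup", [sys.executable, "-c", "print('backing up')"]),
    ("restart database", [sys.executable, "-c", "print('restarting')"]),
]
```

Calling `run_chain(example_steps)` runs all three in sequence. In practice you would probably want the restart step to run even when the backup fails; this sketch mirrors the strict "only after good status" dependency described above.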

https://knowledgebase.macrium.com/di...ware+databases

I sincerely doubt Thunderbird provides its own VSS writer. What does
Tbird use to manage its message store? Isn't it SQLite? SQLite is not
a VSS-aware database program. In fact, it isn't a database program at
all. It's a library from which some program can call its functions (aka
methods). It would be up to the calling program to be VSS-aware, and I
doubt Mozilla ever added that to Tbird.

http://sqlite.1065341.n5.nabble.com/...r-td85887.html

I remember back when using MS Outlook with POP which stored its message
store in a PST file that backups would often skip that database. While
Outlook was running, its database couldn't be backed up because it
wasn't only in-use but also locked as a database. MS didn't provide a
VSS writer just for Outlook. Some users used batch files that would
kill Outlook, run the backup (to include Outlook's message store), and
reload Outlook after the backup finished. However, there is no way
to gracefully unload Outlook. There is no command-line switch for Outlook to
ask it to unload. You had to kill it, and that's always a bad way to
smash a program with open files since corruption can occur to the files.
Some backup programs worked around the problem by installing an
extension into Outlook that would exit it and start the backup program,
and the backup program would later restart Outlook. I'm sure there were
other workarounds. Since Outlook is a client, not a server, there
really was no need to leave it running 24x7, but a lot of users ran it
that way, so it was available upon their return to their computer.

Not all programs that manage a database are VSS-aware. Usually the
easiest solution is to make sure the program using the database is not
running at the time of the backup job. Does Tbird have a command-line
switch that will unload the currently loaded instance(s) of Tbird?
Using taskkill.exe is abrupt and can result in file corruption. If
Tbird can be requested to shut down gracefully, you could do that in a
script, run the backup job, and reload Tbird (if you can get scripts via
Powershell, VBscript, or batch to work in Reflect).

I doubt Tbird generates an event when it exits (i.e., you don't see
anything in Event Viewer). If it does, you can define a scheduled event
in Task Scheduler to run the backup job that triggers on the exit event
of Tbird.
  #17  
April 27th 20, 10:17 PM, posted to alt.comp.os.windows-10,comp.sys.ibm.pc.hardware.storage
Yousuf Khan

On 4/27/2020 2:03 PM, VanguardLH wrote:
What did you use to check if there were junctions defined within the
folder? For example, you could use Nirsoft's NTFSLinksView tool to scan
for junctions to list them. You can specify the start folder from where
to search, like the folder with the 500K+ files, or search from the root
folder of a drive (junctions can point to other local volumes, though
not to network shares). Alas, if you pick the problematic folder, a scan
will only show any junctions in that folder, not those that point at
that folder. You might want to scan from the root folder, and then
check if that folder is under a junction. Windows has been using
junctions for a long time, especially where Microsoft changed the name
of a special folder: "Documents and Settings", the old name, is now a
junction that points to C:\Users, and "My Documents" inside a profile
is a junction to Documents. Could be your problematic folder is under
a junction like that.


I don't have to look for junctions, I know where they are. If there were
junctions here, I would have put them in myself, otherwise they aren't
there.

Yousuf Khan
  #18  
April 27th 20, 10:22 PM, posted to alt.comp.os.windows-10,comp.sys.ibm.pc.hardware.storage
Yousuf Khan

On 4/27/2020 12:04 PM, Frank Slootweg wrote:
If there are 580,000 files in the News folder, then you've probably
configured your Thunderbird News account(s) to use one file for each
article instead of one file for each newsgroup.

If so, it's probably best to bite the bullet and convert to one file
per newsgroup. That probably needs an export and (re-)import and
probably will be time-consuming, but at least then you'll solve the
actual problem.

FYI, my setup - not Thunderbird - has nearly a million articles, but
only some 600 files.


Yes, that is exactly the problem I was getting at. Does Thunderbird
have a new news file format available? My assumption was that
Thunderbird only does 1 file/message? What's the option to convert?

Yousuf Khan
  #19  
April 27th 20, 10:44 PM, posted to alt.comp.os.windows-10,comp.sys.ibm.pc.hardware.storage
Yousuf Khan

On 4/27/2020 2:29 PM, VanguardLH wrote:
As a test, disable your anti-virus software and run your TB data-only
backup job.


Yes, that's been done years ago too. This folder has been a major
headache for years now. And at one time, I found that the AV software
was spending tons of time scanning this folder too, so I put an
exclusion in it for this folder. The AV doesn't ever scan in this
folder anymore.

As another test, make sure to *exit* Thunderbird (check there are no
instances of TB in Task Manager's Processes tab), and check if the
backup job is just as slow.


Yeah, but it doesn't matter, Thunderbird's email folders don't suffer
from this problem. If Thunderbird running in the background, or its
not being VSS aware, were the cause, then this problem would be
happening during backups of the email store as well, but it's only
happening in the newsgroup store. The email store is much, much more
active than the newsgroup store, but emails aren't affected, just
newsgroups.

VSS will encounter problems with databases that are not VSS aware.
Microsoft's SQL Server is VSS aware, but others are not. The
recommendation in backup programs, even those using VSS, for database
programs that are not VSS aware is to schedule their shutdown before the
backup, schedule the backup while the database program is down, and
restart the database program after the backup finishes. While this can
be done using Task Scheduler using event triggers (provided the database
program issues an event on shutdown), it's a pain to figure out the
script-like code you have to use to define for the trigger of the
scheduled event. There are schedulers that are more flexible that can
make their events dependent: task 3 runs only after task 2 ran and
returned good status which runs only after task 1 completed and returned
good status.


Thunderbird never downloads newsgroup messages in the background, like
it does with email, it only downloads them when you explicitly open the
newsgroups account. This is also related to what I said above about how
much busier the Thunderbird email store is compared to the
newsgroup store. Thunderbird may be doing things in the background but
only with email.

It's not related to VSS, I've already given you the most likely cause of
the problem: there are over half a million files, and each file is
inefficiently taking up a little over half of an NTFS cluster, rather
than a smaller number of files each spread over many clusters. The real
question is how can we make NTFS more efficient at handling all of these
little files? NTFS is great at handling big files, but tiny little files
not so much.
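
For reference, the figures quoted in this thread can be worked through in a few lines of Python. The inputs are the numbers from the original post (580,000 files, about 1.4 GB of data, 2.4 GB allocated, 8 to 10 hours, versus 390 GB in 2 hours); decimal gigabytes and a 4096-byte cluster are assumptions.

```python
files = 580_000
data_bytes = 1.4e9        # "about 1.4 GB" of file data (decimal GB assumed)
allocated_bytes = 2.4e9   # "2.4 GB" allocated on disk
cluster = 4096            # assumed default NTFS cluster size

avg_size = data_bytes / files          # average bytes of data per file
avg_alloc = allocated_bytes / files    # average bytes allocated per file
fill = data_bytes / allocated_bytes    # fraction of allocated space in use

# Backup rates implied by the quoted timings.
small_files_per_sec = (files / (10 * 3600), files / (8 * 3600))  # 10 h, 8 h
small_bytes_per_sec = data_bytes / (8 * 3600)                    # best case
rest_bytes_per_sec = 390e9 / (2 * 3600)                          # 390 GB in 2 h

print(f"average data per file:      {avg_size:,.0f} bytes")
print(f"average allocated per file: {avg_alloc:,.0f} bytes ({avg_alloc / cluster:.2f} clusters)")
print(f"cluster fill:               {fill:.0%}")
print(f"news folder:                {small_files_per_sec[0]:.0f} to {small_files_per_sec[1]:.0f} files/s")
print(f"news folder throughput:     {small_bytes_per_sec / 1e3:,.0f} KB/s")
print(f"rest of drive throughput:   {rest_bytes_per_sec / 1e6:,.0f} MB/s")
```

So each article averages roughly 2.4 KB in roughly one 4 KB cluster, and the backup gets through only around 16 to 20 files per second. The byte throughput on this folder is about a thousand times lower than on the rest of the drive, which points at per-file overhead rather than at the amount of data.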

Yousuf Khan
  #20  
April 27th 20, 11:15 PM, posted to alt.comp.os.windows-10,comp.sys.ibm.pc.hardware.storage
T

On 2020-04-26 18:24, Yousuf Khan wrote:
I have a folder on one of my SSD drives that takes 8 to 10 hours to back
up. It is only about 1.4 GB, but it is allocated 2.4 GB of space
altogether, and there are 580,000 files here. Indicates that per file
it's using up a little bit over half of a cluster on average. File
system is NTFS.

Meanwhile, this same drive can backup the remainder of the drive in
under 2 hours, and the remainder of the drive is 390 GB! Is NTFS this
inefficient for small files like this?

Yousuf Khan


Hi Yousuf,

When I see things like this, it is usually a failing
drive, especially when the index on the offending
directory never finishes.

This will show up like a sore thumb if you run your
drive through gsmartcontrol: check the error logs and
run the self tests

http://gsmartcontrol.sourceforge.net....php/Downloads

Get back to us!

-T



 



