HardwareBanter

HardwareBanter (http://www.hardwarebanter.com/index.php)
-   Storage (alternative) (http://www.hardwarebanter.com/forumdisplay.php?f=31)
-   -   Why is this folder so slow? (http://www.hardwarebanter.com/showthread.php?t=200073)

Yousuf Khan[_2_] April 27th 20 02:24 AM

Why is this folder so slow?
 
I have a folder on one of my SSD drives that takes 8 to 10 hours to back
up. It is only about 1.4 GB, but it is allocated 2.4 GB of space
altogether, and there are 580,000 files here. That indicates each file
is using up a little over half of a cluster on average. File
system is NTFS.

Meanwhile, this same drive can backup the remainder of the drive in
under 2 hours, and the remainder of the drive is 390 GB! Is NTFS this
inefficient for small files like this?
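For what it's worth, the arithmetic works out like this (a quick Python
sketch, assuming NTFS's default 4 KB cluster size):

```python
# Back-of-the-envelope numbers for the slow folder.
# Assumes the default 4 KB NTFS cluster size.
CLUSTER = 4 * 1024

data_bytes = 1.4e9       # actual file data, ~1.4 GB
alloc_bytes = 2.4e9      # allocated space on disk, ~2.4 GB
files = 580_000

avg_data = data_bytes / files     # ~2,414 bytes of data per file
avg_alloc = alloc_bytes / files   # ~4,138 bytes allocated per file

print(f"data per file:  {avg_data:,.0f} B (~{avg_data / CLUSTER:.2f} clusters)")
print(f"alloc per file: {avg_alloc:,.0f} B (~{avg_alloc / CLUSTER:.2f} clusters)")
```

So each file holds a bit over half a cluster of data, while roughly a
whole cluster gets allocated for it; nearly 1 GB of the 2.4 GB is slack.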

Yousuf Khan

VanguardLH[_2_] April 27th 20 02:32 AM

Why is this folder so slow?
 
Yousuf Khan wrote:

I have a folder on one of my SSD drives that takes 8 to 10 hours to back
up. It is only about 1.4 GB, but it is allocated 2.4 GB of space
altogether, and there are 580,000 files here. Indicates that per file
it's using up a little bit over half of a cluster on average. File
system is NTFS.

Meanwhile, this same drive can backup the remainder of the drive in
under 2 hours, and the remainder of the drive is 390 GB! Is NTFS this
inefficient for small files like this?


Using WHAT backup software? Doing a file-based or image-based backup?

Is it a direct access to the folder, or are you using a redirection,
like a junction (reparse point)? Does that folder itself have any
redirections which could run the backup program into a loop if it
doesn't specifically ignore those?
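To illustrate the kind of loop being asked about, here is a minimal
Python walk that refuses to follow links (a sketch only; on Windows,
junctions may need an extra check beyond is_symlink(), such as
os.path.isjunction() on Python 3.12+ or a reparse-point attribute test):

```python
import os

def count_files(root: str) -> int:
    """Count files under root without following links, so a circular
    redirection (e.g. a link pointing back up the tree) cannot send
    the walk into an infinite loop."""
    total = 0
    for entry in os.scandir(root):
        if entry.is_symlink():
            continue                      # never follow links
        if entry.is_dir(follow_symlinks=False):
            total += count_files(entry.path)
        elif entry.is_file(follow_symlinks=False):
            total += 1
    return total
```

A backup tool that skips this check can revisit the same subtree
endlessly, which is exactly the hazard with unignored reparse points.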

Paul[_28_] April 27th 20 02:46 AM

Why is this folder so slow?
 
Yousuf Khan wrote:
I have a folder on one of my SSD drives that takes 8 to 10 hours to back
up. It is only about 1.4 GB, but it is allocated 2.4 GB of space
altogether, and there are 580,000 files here. Indicates that per file
it's using up a little bit over half of a cluster on average. File
system is NTFS.

Meanwhile, this same drive can backup the remainder of the drive in
under 2 hours, and the remainder of the drive is 390 GB! Is NTFS this
inefficient for small files like this?

Yousuf Khan


Have you tried to "defragment" the drive ?

Normally, the "optimize" dialog will not offer defragmentation
as an option in Windows 10. It's supposed to offer "TRIM" as
the option for an SSD.

However, there is a "Copy On Write" or COW issue with SSDs.
Under the right circumstances, there will be a slowdown.

Now, consider what you're doing. Your backup software uses VSS
to make a shadow copy. It's possible some "COW activity" is happening
during the backup.

The Optimize dialog knows about this, and the Optimize dialog
has some sort of metric it uses to decide what to do. While
most of the time, it will only offer TRIM, I bet in your
case, it's "going to have a COW" and defragment your drive.
This should not be as thorough as a regular defragment,
and the design of what's done, should have something to do
with whatever the root cause of "having a COW" is.

I've not seen this slow behavior here, so have no
first hand experiences to offer on it. Note that over the
years Windows 10 has existed, the behavior of the Optimize
panel has been "as crazy as Cocoa Puffs". The software
frequently could not properly tell an HDD from an SSD,
and it would be damn hard to see any "subtle" behaviors,
when this software has had so many bugs in the past. I've
had a machine full of HDDs offer nothing but TRIM and
the Optimize panel declared all my drives as SSD drives.
Which is total bull**** and most annoying when you
actually want the defrag to work. As far as I can remember,
Optimize is working in 1909 OK now. It's been a hell of a
bumpy ride though, over the years.

See if you're offered a defrag option.

Do Properties on the drive letter, and in the Tools tab
you'll find the Optimize. Then retest your backup rate
after the partition has been cleaned up.

Paul

Boris[_7_] April 27th 20 05:01 AM

Why is this folder so slow?
 
VanguardLH wrote in :

Yousuf Khan wrote:

I have a folder on one of my SSD drives that takes 8 to 10 hours to
back up. It is only about 1.4 GB, but it is allocated 2.4 GB of space
altogether, and there are 580,000 files here. Indicates that per file
it's using up a little bit over half of a cluster on average. File
system is NTFS.

Meanwhile, this same drive can backup the remainder of the drive in
under 2 hours, and the remainder of the drive is 390 GB! Is NTFS this
inefficient for small files like this?


Using WHAT backup software? Doing a file-based or image-based backup?

Is it a direct access to the folder, or are you using a redirection,
like a junction (reparse point)? Does that folder itself have any
redirections which could run the backup program into a loop if it
doesn't specifically ignore those?


That may be it. I remember reading about junctions causing havoc if they
were in a backup scheme (I think in a folder/file backup). Amazingly,
there was some helpful information (for me, at the time) on a Microsoft
forum, about identifying junctions, found in paragraph two of darrenc1's
answer.

To the OP:

https://answers.microsoft.com/en-us/...rum/windows_7-performance/what-is-a-reparse-point-can-anyone-reveal-the/17b9b457-6c8a-4e83-a445-e603011a6b95

or

https://tinyurl.com/y8hssmg6


Yousuf Khan[_2_] April 27th 20 06:33 AM

Why is this folder so slow?
 
On 4/26/2020 9:32 PM, VanguardLH wrote:
Using WHAT backup software? Doing a file-based or image-based backup?


Macrium, file-based.

Is it a direct access to the folder, or are you using a redirection,
like a junction (reparse point)? Does that folder itself have any
redirections which could run the backup program into a loop if it
doesn't specifically ignore those?


No, none of that. Straightforward unredirected.

Yousuf Khan


Yousuf Khan[_2_] April 27th 20 07:06 AM

Why is this folder so slow?
 
On 4/26/2020 9:46 PM, Paul wrote:
Have you tried to "defragment" the drive ?


No, considering it's an SSD. But as you pointed out later, the optimize
option is available for both of my SSD's, but optimize recognizes them
as SSD's, so the only optimization available is trimming, no defragging.

Normally, the "optimize" dialog will not offer defragmentation
as an option in Windows 10. It's supposed to offer "TRIM" as
the option for an SSD.

However, there is a "Copy On Write" or COW issue with SSDs.
Under the right circumstances, there will be a slowdown.


Yes, likely this is exactly that circumstance. Do you know what the
symptoms of that circumstance are?

Now, consider what you're doing. Your backup software uses VSS
to make a shadow copy. It's possible some "COW activity" is happening
during the backup.


Yes, VSS is used by the software, which is Macrium Reflect 6 BTW.
Reflect's logs show that it creates the VSS shadows immediately before
beginning the backup.

This backup runs after midnight, and there is little activity while any
of the backups run. All of the backups run after midnight and they
finish relatively quickly, except this one.

The Optimize dialog knows about this, and the Optimize dialog
has some sort of metric it uses to decide what to do. While
most of the time, it will only offer TRIM, I bet in your
case, it's "going to have a COW" and defragment your drive.
This should not be as thorough as a regular defragment,
and the design of what's done, should have something to do
with whatever the root cause of "having a COW" is.


VSS is used on all of the backup jobs. None of the others exhibit this
behaviour. In fact, I've experienced this issue for nearly a decade now.
The problem started on Windows XP, continued on into Windows 7, and
continues to plague me in Windows 10. This particular folder has also
been migrated around from HDD to SSD, to a 2nd SSD, etc. So it's not a
problem that is specific to HDD's or SSD's, or to any particular version
of Windows.

I'll tell you what this folder is. It's actually my Thunderbird News
folder (exactly what I'm using to ask this question here), which exists
under my User folder structure. The problem was discovered when I
started doing daily backups of my User folder and discovered that the
User folder was taking forever. After investigating it some, I figured
out that the problem was this particular substructure under News. Once I
excluded the News folder, backups finished 6 times faster! So I moved
the backups of the News folder to their own job, and let the rest of the
User folder get backed up separately. Before you ask, I only back up the
News folder once a week, but it's still a pain in the ass watching it
take so long even once a week.

Some other background. When this particular backup is happening, it's
not the drives that are showing as busy, it's the CPU cores! 4 out of
the 8 cores on my FX-8300 are fluctuating between 50% to 100% busy,
while the other 4 are not that busy.

Yousuf Khan

Paul[_28_] April 27th 20 08:57 AM

Why is this folder so slow?
 
Yousuf Khan wrote:


Some other background. When this particular backup is happening, it's
not the drives that are showing as busy, it's the CPU cores! 4 out of
the 8 cores on my FX-8300 are fluctuating between 50% to 100% busy,
while the other 4 are not that busy.

Yousuf Khan


I seem to remember at some time in the past, you offered
advice on putting an exception for an AV program,
so it does not scan that particular directory
(something in Thunderbird).

If your CPU cores are railed, I'd be tracing down the
PID of the offender.

One way to do it on a Pro SKU of OS, is

tasklist /svc # should not work on Home

and that will tell you what is inside a SVCHOST. You
can also do that with Process Explorer from Sysinternals,
running concurrently with Task Manager, and flip over
to Process Explorer to see what is in a busy PID in
Task Manager. If you elevate Process Explorer using
"Run as Administrator", it can even take a stack snapshot
of a SVCHOST, and you can get additional information.

For example, I have a SVCHOST with 15 things in it,
and one is wuauserv. If a Windows Update scan is running,
that SVCHOST lights up -- but then you have to guess
that's the guilty service, as the rest of the services
aren't normally a problem.

When Macrium is running, CPU effort goes into two things:

1) Running a checksum to stamp the .mrimg when finished.
This detects corruption later (like when restoring perhaps).

2) Compression. If the lightweight compressor is turned on,
that will use a core. I don't think Macrium uses multi-core
for its compressor.
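The CPU cost of those two steps is easy to picture with standard-library
stand-ins (a sketch only; Macrium's actual checksum and compressor are
internal, so zlib and MD5 here are assumptions for illustration):

```python
import hashlib
import zlib

def backup_chunk(data: bytes):
    """Mimic the per-chunk CPU work of an imaging backup: compress the
    data, then checksum the result so corruption can be detected at
    restore time. Both steps are single-core, CPU-bound operations."""
    compressed = zlib.compress(data, level=1)      # lightweight compression
    digest = hashlib.md5(compressed).hexdigest()   # integrity stamp
    return compressed, digest

payload = b"example file contents " * 1000
blob, checksum = backup_chunk(payload)
print(f"{len(payload)} -> {len(blob)} bytes, md5 {checksum[:8]}")
```

Neither step touches the disk once the data is read, which is why a busy
core with an idle drive fits this profile.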

If you were seeing more than that, I'd be looking at
MsMpEng as a culprit, as it could cause quite a penalty
if every small file involved a scan by Windows Defender.

When I ran hashdeep64 in Windows 10, I think the calc
ran 8x slower than normal, to give some idea what a
penalty Windows Defender causes on reads.

Paul

Todesco April 27th 20 01:28 PM

Why is this folder so slow?
 
On 4/27/2020 2:06 AM, Yousuf Khan wrote:
On 4/26/2020 9:46 PM, Paul wrote:
Have you tried to "defragment" the drive ?


No, considering it's an SSD. But as you pointed out later, the optimize
option is available for both of my SSD's, but optimize recognizes them
as SSD's, so the only optimization available is trimming, no defragging.

Normally, the "optimize" dialog will not offer defragmentation
as an option in Windows 10. It's supposed to offer "TRIM" as
the option for an SSD.

However, there is a "Copy On Write" or COW issue with SSDs.
Under the right circumstances, there will be a slowdown.


Yes, likely this is exactly that circumstance. Do you know what the
symptoms of that circumstance are?

Now, consider what you're doing. Your backup software uses VSS
to make a shadow copy. It's possible some "COW activity" is happening
during the backup.


Yes, VSS is used by the software, which is Macrium Reflect 6 BTW.
Reflect's logs show that it creates the VSS shadows immediately before
beginning the backup.

This backup runs after midnight, and there is little activity while any
of the backups run. All of the backups run after midnight and they
finish relatively quickly, except this one.

The Optimize dialog knows about this, and the Optimize dialog
has some sort of metric it uses to decide what to do. While
most of the time, it will only offer TRIM, I bet in your
case, it's "going to have a COW" and defragment your drive.
This should not be as thorough as a regular defragment,
and the design of what's done, should have something to do
with whatever the root cause of "having a COW" is.


VSS is used on all of the backup jobs. None of the others exhibit this
behaviour. In fact, I've experienced this issue for nearly a decade now.
The problem started on Windows XP, continued on into Windows 7, and
continues to plague me in Windows 10. This particular folder has also
been migrated around from HDD to SSD, to a 2nd SSD, etc. So it's not a
problem that is specific to HDD's or SSD's, or to any particular version
of Windows.

I'll tell you what this folder is. It's actually my Thunderbird News
folder (exactly what I'm using to ask this question here), which exists
under the my User folder structure. The problem was discovered when I
started doing daily backups of my User folder and discovered that the
User folder was taking forever. After investigating it some, I figured
out that the problem was this particular substructure under News. Once I
excluded the News folder, backups finished 6 times faster! So I moved
the backups of the News folder to their own job, and let the rest of the
User folder get backed up separately. Before, you ask, I only backup the
News folder once a week, but it's still a pain in the ass watching it
take so long even once a week.

Some other background. When this particular backup is happening, it's
not the drives that are showing as busy, it's the CPU cores! 4 out of
the 8 cores on my FX-8300 are fluctuating between 50% to 100% busy,
while the other 4 are not that busy.

Yousuf Khan

Not really sure, but I think TB does compression on its files. If you
don't allow it, that might be the cause.

Yousuf Khan[_2_] April 27th 20 03:16 PM

Why is this folder so slow?
 
On 4/27/2020 8:28 AM, Todesco wrote:
Not really sure, but I think TB does compression on its files. If you
don't allow it, that might be the cause.


It does that only when it's active and running, in this case it's not
running. Also it doesn't compress newsgroup files, just email files.

Yousuf Khan[_2_] April 27th 20 03:18 PM

Why is this folder so slow?
 
On 4/27/2020 3:57 AM, Paul wrote:
I seem to remember at some time in the past, you offered
advice on putting an exception for an AV program,
so it does not scan that particular directory
(something in Thunderbird).

If your CPU cores are railed, I'd be tracing down the
PID of the offender.

One way to do it on a Pro SKU of OS, is

tasklist /svc # should not work on Home


Not even necessary, I can tell you right now which process is
responsible, it's the Macrium Reflect binary. Also the System process
which I assume the Reflect binary also makes heavy use of during this time.

Ken Blake[_4_] April 27th 20 03:30 PM

Why is this folder so slow?
 
On 4/26/2020 6:46 PM, Paul wrote:
Yousuf Khan wrote:
I have a folder on one of my SSD drives that takes 8 to 10 hours to back
up. It is only about 1.4 GB, but it is allocated 2.4 GB of space
altogether, and there are 580,000 files here. Indicates that per file
it's using up a little bit over half of a cluster on average. File
system is NTFS.

Meanwhile, this same drive can backup the remainder of the drive in
under 2 hours, and the remainder of the drive is 390 GB! Is NTFS this
inefficient for small files like this?

Yousuf Khan


Have you tried to "defragment" the drive ?



Since the folder is on an SSD, fragmentation shouldn't make any difference.

--
Ken

Frank Slootweg April 27th 20 05:04 PM

Why is this folder so slow?
 
Yousuf Khan wrote:
[...]

VSS is used on all of the backup jobs. None of the others exhibit this
behaviour. In fact, I've experienced this issue for nearly a decade now.
The problem started on Windows XP, continued on into Windows 7, and
continues to plague me in Windows 10. This particular folder has also
been migrated around from HDD to SSD, to a 2nd SSD, etc. So it's not a
problem that is specific to HDD's or SSD's, or to any particular version
of Windows.

I'll tell you what this folder is. It's actually my Thunderbird News
folder (exactly what I'm using to ask this question here), which exists
under the my User folder structure. The problem was discovered when I
started doing daily backups of my User folder and discovered that the
User folder was taking forever. After investigating it some, I figured
out that the problem was this particular substructure under News. Once I
excluded the News folder, backups finished 6 times faster! So I moved
the backups of the News folder to their own job, and let the rest of the
User folder get backed up separately. Before, you ask, I only backup the
News folder once a week, but it's still a pain in the ass watching it
take so long even once a week.


If there are 580,000 files in the News folder, then you've probably
configured your Thunderbird News account(s) to use one file for each
article instead of one file for each newsgroup.

If so, it's probably best to bite the bullet and convert to one file
per newsgroup. That probably needs an export and (re-)import and
probably will be time-consuming, but at least then you'll solve the
actual problem.

FYI, my setup - not Thunderbird - has nearly a million articles, but
only some 600 files.
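The conversion being suggested amounts to folding the per-article files
into one file per group. A rough sketch of the idea (hypothetical
layout and a minimal placeholder separator; in practice Thunderbird's
own export/import tooling would do the real job):

```python
import os

def consolidate(article_dir: str, out_path: str) -> int:
    """Fold one-file-per-article storage into a single mbox-style file.
    Returns how many articles were folded in. The 'From -' separator is
    a minimal placeholder, not a full mbox implementation."""
    count = 0
    with open(out_path, "wb") as out:
        for name in sorted(os.listdir(article_dir)):
            path = os.path.join(article_dir, name)
            # skip subdirectories and the output file itself
            if not os.path.isfile(path) or \
               os.path.abspath(path) == os.path.abspath(out_path):
                continue
            out.write(b"From - consolidated\n")
            with open(path, "rb") as src:
                out.write(src.read())
            out.write(b"\n")
            count += 1
    return count
```

580,000 tiny opens become one sequential read, which is where the
backup time is going.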

Some other background. When this particular backup is happening, it's
not the drives that are showing as busy, it's the CPU cores! 4 out of
the 8 cores on my FX-8300 are fluctuating between 50% to 100% busy,
while the other 4 are not that busy.


My guess is that this processing is spent getting the hundreds of
thousands of files into and out of the file system cache.

Yousuf Khan


😉 Good Guy 😉 April 27th 20 05:18 PM

Why is this folder so slow?
 
On 27/04/2020 02:24, Yousuf Khan wrote:
I have a folder on one of my SSD drives that takes 8 to 10 hours to
back up. It is only about 1.4 GB, but it is allocated 2.4 GB of space
altogether, and there are 580,000 files here. Indicates that per file
it's using up a little bit over half of a cluster on average. File
system is NTFS.

Meanwhile, this same drive can backup the remainder of the drive in
under 2 hours, and the remainder of the drive is 390 GB! Is NTFS this
inefficient for small files like this?



NTFS is pretty efficient but some users of Windows 10 machines aren't.
Also, your machine must be showing signs of suspicious activities, so the
backup program must scan it to see if there are any imminent threats to
society in general from your sordid activities.



--
With over 1.2 billion devices now running Windows 10, customer
satisfaction is higher than any previous version of windows.


VanguardLH[_2_] April 27th 20 07:03 PM

Why is this folder so slow?
 
Yousuf Khan wrote:

On 4/26/2020 9:32 PM, VanguardLH wrote:
Using WHAT backup software? Doing a file-based or image-based backup?


Macrium, file-based.

Is it a direct access to the folder, or are you using a redirection,
like a junction (reparse point)? Does that folder itself have any
redirections which could run the backup program into a loop if it
doesn't specifically ignore those?


No, none of that. Straightforward unredirected.


What did you use to check if there were junctions defined within the
folder? For example, you could use Nirsoft's NTFSLinksView tool to scan
for junctions to list them. You can specify the start folder from where
to search, like the folder with the 500K+ files, or search from the root
folder of a drive (junctions can only point at local volumes). Alas, if
you pick the problematic folder, a scan will only show any junctions in
that folder, not those that point at that folder. You might want to
scan from the root folder, and then check if that folder is under a
junction. Windows has been using junctions for a long time, especially
when Microsoft decides to change the name of a special folder: for
example, the old "Documents and Settings" name is now just a junction
that points to C:\Users. Could be your problematic folder is under a
junction like that.

https://knowledgebase.macrium.com/pa...ageId=23397420

That gives some information. As I recall, Macrium is supposed to ignore
symlinks and junctions when creating backups (to prevent looping). That
is, it still records the reparse points, but it shouldn't follow them.

Make damn sure that Macrium Reflect is *NOT* following reparse points
(recording them is okay, but following them during a backup is usually
not okay). Go into Reflect under its Other Tasks menu to select Edit
Defaults. Under the Backup tab, and under the Reparse Points category,
make sure "System - Do not follow" is selected. However, the default
for User Reparse Points is to follow them, but I've seen users screw
them up and generate circular links. See what happens when you set
"User - Do not follow". Those are for the default settings used when
you /create/ a backup job. For old saved job definitions, they may
differ from the current global defaults. Also go into the backup job's
definition and set the reparse follow options the same ("Do not follow"
for both system and user defined reparse points).

You could run a test by moving or copying the problematic folder to
elsewhere that is guaranteed not to be under a junction (after first
checking the folder itself has no junctions), like copying the folder to
C:\problemfolder, and then having Reflect backup just that folder.

Are the files in the problematic folder in use? If open for write,
another process has to either wait for the file handle to close (get
deleted) or times out. Although I also use Macrium Reflect, configuring
it to run pre- and post-job commands is *very* clumsy. You have to
create a Powershell, VBscript, or batch file and have Macrium run that
as its scheduled task. Once you create the script template, you edit it
to add your own commands before or after the backup job. The problem
that I've run into is that Reflect will have the script run the backup
job by calling Reflect as a service which has admin privileges, but
doesn't load the command shell itself with admin privs in which the
script runs, so commands you enter there that require admin privs won't
run. There might be a way around that, but I gave up on Reflect's
clumsy pre- and post-command workaround feature, plus you have to
maintain the script instead of having an easily configurable command
line to edit in the Reflect GUI when creating or editing a backup job.
However, if you can get Reflect's script feature to work to emulate a
pre- and post-job feature, you might look at running the SysInternals'
handle.exe command to see which files might be in-use (have open file
handles) before the backup job starts.

Getting locked out from reading a file can be thwarted by using VSS
(Volume Shadow Copy Service). I'm pretty sure on image backups that Reflect
defaults to using VSS. I don't see an option to not use VSS. However,
under Other Tasks menu, Advanced tab, check if Reflect will
"Automatically retry without VSS writers on failure". If there is a
problem with VSS, Reflect will try to backup without VSS.

Also check the VSS service will change into Running status. Go into
Windows services (services.msc), scroll down to "Volume Shadow Copy"
service. It should be set to Manual startup mode, and not Disabled. It
runs when called. It does not stay running during the entire time that
Windows is running. It is only needed when a shadow copy is needed to
get at in-use or system-restricted files, and you are not backing up the
entire time you have Windows loaded. If you go into Event Viewer,
Application logs, and filter on event ID 8224, you'll see informational
events for "The VSS service is shutting down due to idle timeout."

I forget the idle interval, probably 15 minutes, but once started the
VSS service will eventually stop after the last time it got called by a
VSS requestor and after a VSS writer has completed its task. Those
users that whine the VSS has idle-stopped don't understand this service
is not meant to be always running (Automatic mode). It is manually
called by a requestor, used for a while, and then it stops because it's
not being used anymore. Been that way since Microsoft introduced VSS
back in Windows XP to facilitate backing up of in-use and system files.

https://docs.microsoft.com/en-us/win...w-copy-service

Right click on that service and select Start, or select it and click the
Start button. Did it change into Running status (for awhile)?

Some programs install their own VSS writers. As I recall, Paragon
supplied their own optional VSS writer you could select instead of using
the Windows-provided one. Reflect uses the copy-on-write writer already
provided by Windows. You can see a list of VSS writers by running in a
command shell:

vssadmin list writers

Sorry, I haven't delved far enough into this to know which system VSS
writer that Reflect will employ. Might be the ASR Writer as noted at
https://docs.microsoft.com/en-us/win...ox-vss-writers.
Not sure even Reflect cares, as it likely just issues some system API
call to use VSS.

VSS is only usable when NTFS is used as the file system. You didn't
mention WHERE is the problematic folder. If it is a folder on an
internal drive that uses NTFS, VSS can come into play (if the targeted
files are locked). If the folder is on some external storage media,
like a USB HDD or flash drive, could be that uses FAT32 or some other
file system than NTFS, so VSS can't be used there.

If VSS fails when called by Macrium Reflect, the backup job's log should
note the error. See:

https://knowledgebase.macrium.com/di...oft+VSS+errors

Paul[_28_] April 27th 20 07:09 PM

Why is this folder so slow?
 
Yousuf Khan wrote:
On 4/27/2020 3:57 AM, Paul wrote:
I seem to remember at some time in the past, you offered
advice on putting an exception for an AV program,
so it does not scan that particular directory
(something in Thunderbird).

If your CPU cores are railed, I'd be tracing down the
PID of the offender.

One way to do it on a Pro SKU of OS, is

tasklist /svc # should not work on Home


Not even necessary, I can tell you right now which process is
responsible, it's the Macrium Reflect binary. Also the System process
which I assume the Reflect binary also makes heavy use of during this time.


OK, show me a chunk of nfi.exe output, just
for files in the magical folder. Just enough
to capture the essence of what's going on.

nfi.exe is in here (13,529,558 bytes)

https://web.archive.org/web/20070104...s/oem3sr2s.zip

Run

nfi.exe C: c_nfi.txt

This is what a file looks like, followed by a directory.
A directory has a $I30 entry in it.

File 5468

\YOUTUBE_CAP\out_linux_ffmpeg2.avi
$STANDARD_INFORMATION (resident)
$FILE_NAME (resident)
$DATA (nonresident)
logical sectors 2576342736-2577800527 (0x998fded0-0x99a61d4f)

File 5463

\YOUTUBE_CAP
$STANDARD_INFORMATION (resident)
$FILE_NAME (resident)
$INDEX_ROOT $I30 (resident)
$INDEX_ALLOCATION $I30 (nonresident)
logical sectors 2577800616-2577800623 (0x99a61da8-0x99a61daf)
$BITMAP $I30 (resident)

What we're looking for here, is something
like an extended attribute.

You might also use fsutil, and verify the cluster
size (4KB default). Windows 10 stopped tolerating
non-default cluster sizes on C: about three OSes ago,
so it pretty well has to be 4KB now on cluster size.

One reason I want some info about your 580,000-file folder,
is I want to see if there are no logical sectors
(small files, like 1KB files, fit within $MFT and don't
use clusters for the data storage). Or I want to see if
the clusters are fragmented.
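The resident-file theory can be eyeballed from file sizes alone (a
sketch; the ~700-byte cutoff below is a rough rule of thumb for data
fitting inside a 1 KB MFT record, not an exact NTFS limit):

```python
import os

RESIDENT_LIMIT = 700   # rough guess at data small enough to stay in the MFT record

def count_tiny(root: str):
    """Return (tiny, total) file counts under root; 'tiny' files could
    plausibly be stored resident in the $MFT with no data clusters."""
    tiny = total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            total += 1
            if os.path.getsize(os.path.join(dirpath, name)) <= RESIDENT_LIMIT:
                tiny += 1
    return tiny, total
```

nfi.exe gives the definitive answer (no "logical sectors" line means
resident), but this gives a quick first impression.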

One other thing Windows 10 does now, is they added a small
write cache (per handle). The write cache has "ruined" the
notion of fragmentation, in the sense that no fragment
can be as small as 4KB. The buffer is 64KB. If a file fragments today in
Windows 10, the chunk size should be 64KB.

I use the Passmark fragment generator, to create fragmented
files for test. I noticed that if the Passmark fragment
generator is run on modern Windows 10, the fragments
don't seem to be any smaller than 64KB. If I run under an
older OS, you can see on the screen (JKDefrag) that the
fragments are finer. I do these tests on a RAMDisk so
no harm comes to any physical storage devices.

You might ask "I have a 4KB file to store, what happens
with the 64KB buffer in that case". I don't know. Obviously it
cannot break, or we'd have heard about it by now. The buffer must
flush when the handle closes.

I only mention this new feature, in case you examine your
580,000 files and notice there's no fragmentation at all.

Paul

VanguardLH[_2_] April 27th 20 07:29 PM

Why is this folder so slow?
 
Yousuf Khan wrote:

I'll tell you what this folder is. It's actually my Thunderbird News
folder (exactly what I'm using to ask this question here), which exists
under the my User folder structure. The problem was discovered when I
started doing daily backups of my User folder and discovered that the
User folder was taking forever. After investigating it some, I figured
out that the problem was this particular substructure under News. Once I
excluded the News folder, backups finished 6 times faster! So I moved
the backups of the News folder to their own job, and let the rest of the
User folder get backed up separately. Before, you ask, I only backup the
News folder once a week, but it's still a pain in the ass watching it
take so long even once a week.

Some other background. When this particular backup is happening, it's
not the drives that are showing as busy, it's the CPU cores! 4 out of
the 8 cores on my FX-8300 are fluctuating between 50% to 100% busy,
while the other 4 are not that busy.

Yousuf Khan


As a test, disable your anti-virus software and run your TB data-only
backup job.

As another test, make sure to *exit* Thunderbird (check there are no
instances of TB in Task Manager's Processes tab), and check if the
backup job is just as slow.

Do you leave TB running all the time? Does the backup job run as a
scheduled event at a time after you would've unloaded TB, like you use
TB during the day (say 8AM to 11 PM), unload it when done, and you
schedule the backup job to run early morning (say 4 AM)?

VSS will encounter problems with databases that are not VSS aware.
Microsoft's SQL Server is VSS aware, but others are not. The
recommendation in backup programs, even those using VSS, for database
programs that are not VSS aware is to schedule their shutdown before the
backup, schedule the backup while the database program is down, and
restart the database program after the backup finishes. While this can
be done using Task Scheduler using event triggers (provided the database
program issues an event on shutdown), it's a pain to figure out the
script-like code you have to use to define for the trigger of the
scheduled event. There are schedulers that are more flexible that can
make their events dependent: task 3 runs only after task 2 ran and
returned good status which runs only after task 1 completed and returned
good status.

https://knowledgebase.macrium.com/di...ware+databases
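That dependency chain ("run step N only if step N-1 returned good
status") is simple to sketch (hypothetical commands; the real steps
would be the database shutdown, the Reflect job, and the restart):

```python
import subprocess

def run_chain(commands) -> int:
    """Run commands in order, stopping at the first failure, so the
    backup step never runs against a database that failed to shut
    down. Returns how many steps completed successfully."""
    done = 0
    for cmd in commands:
        if subprocess.run(cmd).returncode != 0:
            break
        done += 1
    return done

# e.g. run_chain([["stop-db.cmd"], ["reflect-backup.cmd"], ["start-db.cmd"]])
# (the .cmd names above are placeholders, not real Macrium tooling)
```

This is the behavior Task Scheduler alone makes awkward, and that the
more flexible third-party schedulers provide directly.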

I sincerely doubt Thunderbird provides its own VSS writer. What does
Tbird use to manage its message store? Isn't it SQLite? SQLite is not
a VSS-aware database program. In fact, it isn't a database program at
all. It's a library from which some program can call its functions (aka
methods). It would be up to the calling program to be VSS-aware, and I
doubt Mozilla ever added that to Tbird.

http://sqlite.1065341.n5.nabble.com/...r-td85887.html

I remember back when using MS Outlook with POP which stored its message
store in a PST file that backups would often skip that database. While
Outlook was running, its database couldn't be backed up because it
wasn't only in-use but also locked as a database. MS didn't provide a
VSS writer just for Outlook. Some users used batch files that would
kill Outlook, run the backup (to include Outlook's message store), and
reload Outlook after the backup finished. However, Outlook has no way
to gracefully unload itself. There is no command-line switch to ask
Outlook to unload. You had to kill it, and killing a program with open
files is always risky, since the files can get corrupted.
Some backup programs worked around the problem by installing an
extension into Outlook that would exit it and start the backup program,
and the backup program would later restart Outlook. I'm sure there were
other workarounds. Since Outlook is a client, not a server, there
really was no need to leave it running 24x7, but a lot of users ran it
that way, so it was available upon their return to their computer.

Not all programs that manage a database are VSS-aware. Usually the
easiest solution is to make sure the program using the database is not
running at the time of the backup job. Does Tbird have a command-line
switch that will unload the currently loaded instance(s) of Tbird?
Using taskkill.exe is abrupt and can result in file corruption. If
Tbird can be requested to gracefully shutdown, you could do that in a
script, run the backup job, and reload Tbird (if you can get scripts via
Powershell, VBscript, or batch to work in Reflect).

I doubt Tbird generates an event when it exits (i.e., you don't see
anything in Event Viewer). If it does, you can define a scheduled event
in Task Scheduler to run the backup job that triggers on the exit event
of Tbird.

Yousuf Khan[_2_] April 27th 20 10:17 PM

Why is this folder so slow?
 
On 4/27/2020 2:03 PM, VanguardLH wrote:
What did you use to check if there were junctions defined within the
folder? For example, you could use Nirsoft's NTFSLinksView tool to scan
for junctions to list them. You can specify the start folder from where
to search, like the folder with the 500K+ files, or search from the root
folder of a drive (junctions cannot point to other drives). Alas, if
you pick the problematic folder, a scan will only show any junctions in
that folder, not those that point at that folder. You might want to
scan from the root folder, and then check if that folder is under a
junction. Windows has been using junctions for a long time, especially
where Microsoft has renamed a special folder: the old "Documents and
Settings" name, for example, is kept as a junction pointing at
C:\Users. Could be your problematic folder is under a
junction, like Documents.


I don't have to look for junctions, I know where they are. If there were
junctions here, I would have put them in myself, otherwise they aren't
there.

Yousuf Khan

Yousuf Khan[_2_] April 27th 20 10:22 PM

Why is this folder so slow?
 
On 4/27/2020 12:04 PM, Frank Slootweg wrote:
If there are 580,000 files in the News folder, then you've probably
configured your Thunderbird News account(s) to use one file for each
article instead of one file for each newsgroup.

If so, it's probably best to bite the bullet and convert to one file
per newsgroup. That probably needs an export and (re-)import and
probably will be time-consuming, but at least then you'll solve the
actual problem.

FYI, my setup - not Thunderbird - has nearly a million articles, but
only some 600 files.


Yes, that is exactly the problem I was getting at. Does Thunderbird
have a new news file format available? My assumption was that
Thunderbird only does 1 file/message? What's the option to convert?

Yousuf Khan

Yousuf Khan[_2_] April 27th 20 10:44 PM

Why is this folder so slow?
 
On 4/27/2020 2:29 PM, VanguardLH wrote:
As a test, disable your anti-virus software and run your TB data-only
backup job.


Yes, that's been done years ago too. This folder has been a major
headache for years now. And at one time, I found that the AV software
spending tons of time scanning this folder too, so I put an exclusion in
it for this folder. The AV doesn't ever scan in this folder anymore.

As another test, make sure to *exit* Thunderbird (check there are no
instances of TB in Task Manager's Processes tab), and check if the
backup job is just as slow.


Yeah, but it doesn't matter, Thunderbird's email folders don't suffer
from this problem. So even if Thunderbird were running in the
background, and even if it were VSS aware, then this problem would be
happening during backups of the email store as well, but it's only
happening in the newsgroup store. The email store is much, much more
active than the newsgroup store, but emails aren't affected, just
newsgroups.

VSS will encounter problems with databases that are not VSS aware.
Microsoft's SQL Server is VSS aware, but others are not. The
recommendation in backup programs, even those using VSS, for database
programs that are not VSS aware is to schedule their shutdown before the
backup, schedule the backup while the database program is down, and
restart the database program after the backup finishes. While this can
be done using Task Scheduler using event triggers (provided the database
program issues an event on shutdown), it's a pain to figure out the
script-like code you have to use to define for the trigger of the
scheduled event. There are schedulers that are more flexible that can
make their events dependent: task 3 runs only after task 2 ran and
returned good status which runs only after task 1 completed and returned
good status.


Thunderbird never downloads newsgroup messages in the background, like
it does with email, it only downloads them when you explicitly open the
newsgroups account. This is also related to what I said above about how
much busier the Thunderbird email store is compared to the
newsgroup store. Thunderbird may be doing things in the background, but
only with email.

It's not related to VSS. I've already given you the most likely cause of
the problem: there are over half a million files, and each file is
inefficiently taking up a little over half of an NTFS cluster, rather
than a smaller number of files spread over many clusters. The real
question is how we can make NTFS more efficient at handling all of these
little files. NTFS is great at handling big files, but tiny little files
not so much.
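A quick sanity check of those numbers, as a sketch assuming the NTFS default 4 KiB cluster size (every file is rounded up to at least one whole cluster):

```python
# 580,000 files holding 1.4 GB of data on an NTFS volume with 4 KiB clusters.
CLUSTER = 4096
n_files = 580_000
data_bytes = 1.4e9

avg_file = data_bytes / n_files       # ~2.4 KB: a bit over half a cluster
min_alloc = n_files * CLUSTER         # one-cluster minimum per file
slack = min_alloc - data_bytes        # ~1 GB of pure slack

print(f"average file size: {avg_file:.0f} bytes "
      f"({avg_file / CLUSTER:.0%} of a cluster)")
print(f"minimum allocation: {min_alloc / 1e9:.2f} GB")
print(f"slack at one cluster per file: {slack / 1e9:.2f} GB")
```

That reproduces the observed figures: roughly 1.4 GB stored against roughly 2.4 GB allocated.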

Yousuf Khan

T[_6_] April 27th 20 11:15 PM

Why is this folder so slow?
 
On 2020-04-26 18:24, Yousuf Khan wrote:
I have a folder on one of my SSD drives that takes 8 to 10 hours to back
up. It is only about 1.4 GB, but it is allocated 2.4 GB of space
altogether, and there are 580,000 files here. Indicates that per file
it's using up a little bit over half of a cluster on average. File
system is NTFS.

Meanwhile, this same drive can backup the remainder of the drive in
under 2 hours, and the remainder of the drive is 390 GB! Is NTFS this
inefficient for small files like this?

Yousuf Khan


Hi Yousuf,

When I see things like this, it is usually a failing
drive, especially when the index on the offending
directory never finishes.

This will show up like a sore thumb if you run your
drive through gsmartcontrol: check the error logs and
run the self tests

http://gsmartcontrol.sourceforge.net....php/Downloads

Get back to us!

-T




T[_6_] April 27th 20 11:16 PM

Why is this folder so slow?
 
On 2020-04-27 07:30, Ken Blake wrote:
Since the folder is on an SSD, fragmentation shouldn't make any difference.


And you will reduce your SSD's wear life by doing a defragment


Paul[_28_] April 28th 20 04:46 AM

Why is this folder so slow?
 
Yousuf Khan wrote:
On 4/27/2020 12:04 PM, Frank Slootweg wrote:
If there are 580,000 files in the News folder, then you've probably
configured your Thunderbird News account(s) to use one file for each
article instead of one file for each newsgroup.

If so, it's probably best to bite the bullet and convert to one file
per newsgroup. That probably needs an export and (re-)import and
probably will be time-consuming, but at least then you'll solve the
actual problem.

FYI, my setup - not Thunderbird - has nearly a million articles, but
only some 600 files.


Yes, that is exactly the problem, I was getting at. Does Thunderbird
have a new news file format available? My assumption was that
Thunderbird only does 1 file/message? What's the option to convert?

Yousuf Khan


These are examples of the setting in the Config Editor.

Note that the GUI "Server Settings" page has the choice grayed
out once the tool is running, implying these can't be switched
on the fly from the GUI.

And changing them here, doesn't mean a "converter" is going to run,
because the tool isn't going to know the "before" and "after"
and figure out what needs to be done, or whether it should
even be doing it.

mail.server.server1.storeContractID = @mozilla.org/msgstore/maildirstore;1

mail.server.server4.storeContractID = @mozilla.org/msgstore/berkeleystore;1

You could try some sort of Import/Export strategy, pulling from
an EML format setup, into an MBOX format setup.

Berkeleystore, as far as I know, is the "file per box" method.
The so-called Mork Storage Format, of which there is
a rudimentary parser available.

Maildirstore, is a file per message method, like an EML at a guess.
You can see this better than I can, as mine are all
going to be Berkeleystore.
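One way to see which store each account uses without opening the Config Editor is to scan prefs.js for those entries. This is an illustrative sketch: it assumes the standard `user_pref("…", "…");` syntax in prefs.js, and the sample text below is made up to match the two lines above:

```python
import re

# Matches lines like:
#   user_pref("mail.server.server1.storeContractID",
#             "@mozilla.org/msgstore/maildirstore;1");
PREF_RE = re.compile(
    r'user_pref\("mail\.server\.(server\d+)\.storeContractID",\s*"([^"]+)"\);'
)

def store_types(prefs_text):
    """Map serverN -> 'maildirstore' (file per message) or
    'berkeleystore' (one mbox file per folder)."""
    return {
        server: contract.rsplit("/", 1)[-1].split(";")[0]
        for server, contract in PREF_RE.findall(prefs_text)
    }

sample = (
    'user_pref("mail.server.server1.storeContractID", '
    '"@mozilla.org/msgstore/maildirstore;1");\n'
    'user_pref("mail.server.server4.storeContractID", '
    '"@mozilla.org/msgstore/berkeleystore;1");\n'
)
print(store_types(sample))
```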

Picture of Version 45 or so. Option not available/implemented
before Version 38.

https://i.postimg.cc/Jh7qmhzN/TBird-stores-option.gif

Paul

Paul[_28_] April 28th 20 05:02 AM

Why is this folder so slow?
 
T wrote:
On 2020-04-27 07:30, Ken Blake wrote:
Since the folder is on an SSD, fragmentation shouldn't make any difference.


And you will reduce your wear life doing a defragment


One way to do this, is with a Macrium backup and restore,
where you use the forward and back button, go back and
"edit" the size of the destination directory. This
causes the restoration to change restore mode, and it
seems to do a file-by-file write when challenged with
even an insignificant file system size change. You don't
have to "pinch it", and in fact pinching it is not
recommended. Making the partition 1MB smaller
should be enough to trigger file-by-file write mode.

This results in a "mostly defragmented" disk. Due to the
handling of the $MFT and the reserved space for $MFT,
there is some "friction fragmentation" as the reserved
space gets squeezed. I can see a little bit of
fluff that doesn't get fixed. The end result of
changing the partition size on the Macrium restore,
is a mostly defragmented partition.

To save on your SSD while doing experiments like this,
you can test on hard drives, and evaluate using nfi.exe
(mentioned in the thread already).

This approach also places an upper bound on the number
of writes and the amount of flash life you're paying
for the privilege. The Windows 10 Defragmenter is pretty
good, but some of the other defragmenters out there,
they can run all night, and that can't be good for an
SSD.

https://i.postimg.cc/Y0TCt8K5/macrium-as-defragger.gif

I noticed that "side effect" one day after a restore,
where I'd changed the destination partition size. I couldn't
believe what I was seeing. Doing that to the 1.4TB
partition in that picture was mostly a joke, as on
a data partition, you don't really need to do that.
I wanted to see whether it would change the symptoms
of another bug I'm working on (and it didn't help).

Paul

VanguardLH[_2_] April 28th 20 06:40 AM

Why is this folder so slow?
 
Yousuf Khan wrote:

On 4/27/2020 2:29 PM, VanguardLH wrote:
As a test, disable your anti-virus software and run your TB data-only
backup job.


Yes, that's been done years ago too. This folder has been a major
headache for years now. And at one time, I found that the AV software
spending tons of time scanning this folder too, so I put an exclusion in
it for this folder. The AV doesn't ever scan in this folder anymore.


I just thought of something else: is that flagged as a special folder?
Right-click on the folder, and select Properties. Is there a Customize
tab? If so, select it, and check the setting for "Optimize this folder
for". Set to "General items" (instead of "Pictures").

It's not related to VSS, I've already given you the most likely cause of
the problem: there are over half million files, and each file is
inefficiently taking up little over half of the NTFS cluster, rather
than spreading a lesser number of files over many clusters. The real
question is how can we make NTFS more efficient at handling all of these
little files? NTFS is great at handling big files, but tiny little files
no so much.


Slack space is also a problem with FAT16/32, ext, or other file systems
where AUs (Allocation Units) are clusters or groups of sectors. The
file system will allocate a number of clusters that will encompass the
size of the file, but will be equal to or larger than the file's
content. Slack space is *not* just an NTFS problem.

For NTFS, files under the size for an MFT's file record are stored
inside the MFT since there is already enough space to hold the file
there. Instead of the MFT file record having a pointer to the small
file outside the MFT where there would be a lot of slack space (the
small file is nowhere the size of a cluster), the MFT file record *is*
the file.

An MFT file record is 1 KB in size. If the file's data is smaller than
the space left in the record, the file is stored inside the MFT record
itself. The usable space is actually less than 1 KB, because the record
has a fixed 42-byte header at its start and must also hold the file
name and system attributes.

https://hetmanrecovery.com/recovery_...ucture.htm#id4
According to specifications, MFT record size is determined by the
value of a variable in the boot sector. In practical terms, all
current versions of Microsoft Windows are using records sized 1024
bytes. The first 42 bytes store the header. The header contains 12
fields. The other 982 bytes do not have a fixed structure, and are
used to keep attributes.
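Those numbers give a rough feel for which files can live resident inside the MFT. This is a heuristic sketch only: the 300-byte attribute allowance is an assumed figure for the name and standard attributes, not a fixed NTFS value:

```python
RECORD_SIZE = 1024   # MFT file record size, per the description above
HEADER_SIZE = 42     # fixed header at the start of each record

def may_be_resident(file_size, attr_overhead=300):
    """Rough guess: True if the file's data could fit inside its own
    MFT record alongside the name and standard attributes.
    attr_overhead is an assumption, not an NTFS constant."""
    return file_size <= RECORD_SIZE - HEADER_SIZE - attr_overhead

print(may_be_resident(500))    # a tiny file: plausibly resident
print(may_be_resident(2400))   # the ~2.4 KB average file here: not resident
```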

The MFT is not infinite in size. NTFS has a limit of 4,294,967,295
files per disk (well, per volume). Your 580,000 files are only about
0.01% of NTFS's capacity for file count. Obviously there are lots of
files elsewhere in that volume.

NTFS doesn't have a problem addressing small files versus large files.
It's the level of fragmentation that causes a problem. Yeah, you think
you don't need to defragment and should not defragment an SSD because,
after all, accessing memory at one address is the same speed as
accessing any other. However, NTFS cannot support an infinite chain of
fragments for a file. Each fragment run consumes space in the file's
MFT records, and heavily fragmented files spill into extension records
beyond the base record. There are limitations in every file system.
Around 1.5 million fragments is the limit per file under NTFS.

Doesn't Thunderbird have a compaction function? Used it yet? I don't
know if that will eliminate any fragmentation of the files used to store
the messages or articles which, as I recall, are stored as separate
files instead of inside a database, but I haven't used TB in a long
time.

Users don't think they ever need to defragment an SSD. All those extra
writes with no effective change in data content reduce the lifespan of
the SSD (writes are destructive). Sure, when there are only a few or a
dozen fragments, the extra writes to defragment are a waste of the SSD. It
takes time to chain from the MFT's record and through every external
extended record (which consumes space in the file system) to build up
the entire file. It's not one lookup in the MFT for the file. It's a
chained lookup for every fragment. IOPS will increase as fragmentation
increases, which is perhaps why you are seeing high CPU usage when
backing up those files. Most users think of fragmentation as a performance issue
with moving physical media, like hard disks. Fragmentation ON ANY MEDIA
is still an I/O overhead issue and inflates the IOPS to process them
all. Yes, there is a limit in NTFS to the number of fragments that a
file may have, but the more fragments there are the more space is
consumed in the file system to track those fragments and the more CPU
consumed to process the fragments. When an OS sees a file comprised of
multiple fragments, there are multiple I/O operations to process
the whole file. If Windows sees 20 pieces at the logical layer, there
are 20 I/O operations to process the whole file as a read or write.

Fragmentation is not just a performance issue at the physical layer. It
is also a performance factor at the logical layer (file system).
Extreme fragmentation usually comes from lots of repeated writes to a
file. I don't know what you've been doing with those files in the
problematic folder. If they are photos, you rarely edit those, just
copy them.

Similarly, for a backup job, it has to perform the IOPS'es needed to
read all the files included in the backup. I have under 400,000 files
on my entire OS+app drive (which is a partition spanning the entire
SSD). You have more than that in one folder. From your description,
the backup job is CPU bound with all those IOPS. Do you really have
over 500K files in just one folder? You never considered creating a
hierarchical structure of subfolders to hold groups of those files based
on a common criteria for each subfolder? Just because you can dump
hundreds of thousands of files into a single folder doesn't mean that's
a good behavior.
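One common way to break up a huge flat folder is to bucket files into subfolders keyed by a hash of the file name. This is a generic sketch of the technique, not something to apply to Thunderbird's own message store (the client tracks its files by path, so rearranging them underneath it would break the account):

```python
import hashlib

def bucket_for(name, width=2):
    """Subfolder key for a file name: the first `width` hex digits of
    an MD5 of the name. width=2 spreads files over 256 subfolders, so
    580,000 files become ~2,300 per folder instead of one huge listing."""
    return hashlib.md5(name.encode("utf-8")).hexdigest()[:width]

# A mover script would then do something like:
#   os.renames(name, os.path.join(root, bucket_for(name), name))
print(bucket_for("article-12345.eml"))  # prints two hex characters
```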

By the way, in Macrium Reflect, did you configure your backup job to
throttle its use of the CPU? That's to prevent a backup job from
sucking up all the CPU while preventing the computer being usable to the
user during the backup. In a Reflect backup job, you can configure its
priority. Well, if you set it at max (which is still, I believe, less
than real-time priority), that process sucks up most of the CPU and
leaves little for other processes, making the computer unusable to
you. Even if you schedule the backup to run when you're not at the
computer, other background processes, like your startup programs, and
even the OS want some CPU slices.

The compression you select for a backup job also dictates how much CPU
it consumes. You will find very little difference in the size of the
backup file between the Medium (recommended) and High compression
levels. The backup job will take a lot longer trying to compress the
backup file further, but the reduction is minimal, especially for
non-compressible file formats like images; it wastes a lot of CPU time
for insignificant gain.

I did not find an option in Reflect to throttle how much bandwidth it
uses on the data bus, like a limit on IOPS. Not for network traffic,
but how busy it keeps the data bus. If it is flooded, and especially
for a high[er] priority process, you have to wait to do any other data
I/O.

Frank Slootweg April 28th 20 04:13 PM

Why is this folder so slow?
 
Yousuf Khan wrote:
On 4/27/2020 12:04 PM, Frank Slootweg wrote:
If there are 580,000 files in the News folder, then you've probably
configured your Thunderbird News account(s) to use one file for each
article instead of one file for each newsgroup.

If so, it's probably best to bite the bullet and convert to one file
per newsgroup. That probably needs an export and (re-)import and
probably will be time-consuming, but at least then you'll solve the
actual problem.

FYI, my setup - not Thunderbird - has nearly a million articles, but
only some 600 files.


Yes, that is exactly the problem, I was getting at. Does Thunderbird
have a new news file format available? My assumption was that
Thunderbird only does 1 file/message? What's the option to convert?


It's not a new News file format, it's a different format.

You set the format in the News account: Tools - Account settings -
your news account in the left pane - 'Server Settings' page -
Message Storage - Message Store Type:. This field *should* be set to
'File per folder (mbox)'. Yours is probably set to 'File per message
(maildir)'.

For your account - i.e. an *existing* account - you probably cannot
change this setting, i.e. you can only set it when you create the
account. Hence my comment about exporting and (re-)importing. If you
cannot change the setting, you will have to export all the articles from
your current account and then re-import all articles into a new account
with 'Message Store Type: File per folder (mbox)'.

The basic Thunderbird program has no export facility and only very
limited import functionality.

For import of e-mail (from Windows Mail), I have used the Thunderbird
ImportExportTools [1] Extension, but I have not used it for News and not
for export.

ImportExportTools can export on a per-folder basis, so you could try
to export just one folder/newsgroup and then import it into a new
account to see if it works for News. Exporting is a copy-type operation,
i.e. the source remains untouched, and if you import to a *new* account,
the old account remains untouched. IOW, it's a totally safe operation.

If ImportExportTools can not solve your problem, you'll probably have
to search the Thunderbird support site(s)/forum(s) or/and post there.

[1]
https://addons.thunderbird.net/en-GB/thunderbird/addon/importexporttools/?src=userprofile

Paul[_28_] April 28th 20 08:14 PM

Why is this folder so slow?
 
Frank Slootweg wrote:
Yousuf Khan wrote:
On 4/27/2020 12:04 PM, Frank Slootweg wrote:
If there are 580,000 files in the News folder, then you've probably
configured your Thunderbird News account(s) to use one file for each
article instead of one file for each newsgroup.

If so, it's probably best to bite the bullet and convert to one file
per newsgroup. That probably needs an export and (re-)import and
probably will be time-consuming, but at least then you'll solve the
actual problem.

FYI, my setup - not Thunderbird - has nearly a million articles, but
only some 600 files.

Yes, that is exactly the problem, I was getting at. Does Thunderbird
have a new news file format available? My assumption was that
Thunderbird only does 1 file/message? What's the option to convert?


It's not a new News file format, it's a different format.

You set the format in the News account: Tools - Account settings -
your news account in the left pane - 'Server Settings' page -
Message Storage - Message Store Type:. This field *should* be set to
'File per folder (mbox)'. Yours is probably set to 'File per message
(maildir)'.

For your account - i.e. an *existing* account - you probably cannot
change this setting, i.e. you can only set it when you create the
account. Hence my comment about exporting and (re-)importing. If you
cannot change the setting, you will have to export all the articles from
your current account and then re-import all articles into a new account
with 'Message Store Type: File per folder (mbox)'.

The basic Thunderbird program has no export facility and only very
limited import functionality.

For import of e-mail (from Windows Mail), I have used the Thunderbird
ImportExportTools [1] Extension, but I have not used it for News and not
for export.

ImportExportTools can export on a per-folder basis, so you could try
to export just one folder/newsgroup and then import it into a new
account to see if it works for News. Exporting is a copy-type operation,
i.e. the source remains untouched, and if you import to a *new* account,
the old account remains untouched. IOW, it's a totally safe operation.

If ImportExportTools can not solve your problem, you'll probably have
to search the Thunderbird support site(s)/forum(s) or/and post there.

[1]
https://addons.thunderbird.net/en-GB/thunderbird/addon/importexporttools/?src=userprofile


The one I was looking at the other day said that it didn't
handle stuff in the News folder specifically.

As for the availability of the MailboxStore option in the
Server settings, the claim is that you must use this
immediately when the installation of Thunderbird is
brand new. In my experiments yesterday, I tried to "clean out"
my profile, and tried not to leave any .msf files, then
set the prefs.js with the maildirstore preference, and
that *still* wasn't enough to make it work. I'm going
to have to nuke the damn thing and start from scratch,
to see if I can get it to work.

One other weirdness from yesterday's experiment: after
I was finished with my failed experiment, I took the ZIP
file holding my unbroken profile, and started to restore
it to my SSD drive. I was greeted by write rates of around
2MB/sec on my SSD. It took forever to restore the fleet
of .msf (file per box) style files. And when I opened
Task Manager, MsMpEng was railed on one core, scanning
everything being written into the profile area. I've done
plenty of other stuff on the computer, where it doesn't
do that with quite the same level of venom. (If I unpack
an .ova on a scratch drive, it does that at several hundred
megabytes per second. As if MsMpEng didn't care.)

Paul

Frank Slootweg April 28th 20 10:08 PM

Why is this folder so slow?
 
Paul wrote:
[...]

As for the availability of the MailboxStore option in the
Server settings, the claim is that you must use this
immediately when the installation of Thunderbird is
brand new.


I think that's not correct. The *installation* doesn't have to be
brand new, the *account* in Thunderbird must be new, i.e. just created.

I added a new News account and could set 'Message Store Type:' to
either 'File per folder (mbox)' or 'File per message (maildir)'.

In my experiments yesterday, I tried to "clean out"
my profile, and tried not to leave any .msf files, then
set the prefs.js with the maildirstore preference, and
that *still* wasn't enough to make it work. I'm going
to have to nuke the damn thing and start from scratch,
to see if I can get it to work.


There's no need to fiddle with preferences as there are perfectly good
settings in the GUI. I think the fiddling might actually have been
counter-productive, because you might have been setting a global default
instead of an account-specific setting.

[...]

Yousuf Khan[_2_] May 1st 20 07:57 AM

Why is this folder so slow?
 
On 4/27/2020 2:09 PM, Paul wrote:
OK, show me a chunk of nfi.exe output, just
for files in the magical folder. Just enough
to capture the essence of what's going on.


What I'm looking for is filesystem tuning advice, not process tuning
advice. Looking at CPU processes is just a wild goose-chase. Yes, the
CPU is being hammered, but we know which processes are responsible and
why, it's hardly surprising which ones they are (i.e. Macrium Reflect
binaries), so it's trivial.

Anyway, since I'm not getting that advice here: as it turned out, I
just received a new SSD (an RMA replacement for a previous SSD, which
has already been swapped out). I decided to try a few tests myself. I
set up the new SSD as the Z drive, and I formatted it with non-default
NTFS settings. This SSD is usually formatted to 4K blocks by default;
I tested it with 0.5K and 1K blocks instead. I then restored a previous backup of
the filesystem to this drive, and tested out the backup and restore
performance. Since this is not the production drive, it's not being
accessed by any other processes like Thunderbird, so it's pristine and
not a busy drive.

I found that with both 0.5K and 1K blocks, the restore operation went
very fast, about half an hour to get fully restored, which is a big
improvement (previously used to take 1.5 hours to restore). Also the
allocation slack was greatly improved, went from 1.4GB stored and 2.4GB
allocated (42% slack), to 1.4GB stored and only 1.5GB allocated (7%
slack). However, then I tried backing up the new drive and it still took
over 8 hours! So writing to the drive is getting very fast, but reading
off of it is still slow. Still the same number of files as before, over
half-a-million.
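Those before/after allocation figures line up with a back-of-envelope model. This sketch assumes the 1.4 GB is spread evenly across the 580,000 files, which real file sizes won't exactly be:

```python
import math

n_files, data_bytes = 580_000, 1.4e9
avg = data_bytes / n_files                      # ~2414 bytes per file

def allocated(cluster_size):
    """Total allocation if every file is rounded up to whole clusters."""
    return n_files * math.ceil(avg / cluster_size) * cluster_size

for cluster in (4096, 512):
    total = allocated(cluster)
    print(f"{cluster:>5}-byte clusters: {total / 1e9:.2f} GB allocated, "
          f"{(total - data_bytes) / total:.0%} slack")
```

At 4096-byte clusters the model gives about 2.38 GB allocated (~41% slack); at 512-byte clusters, about 1.48 GB (~6% slack) — close to the observed 2.4 GB/42% and 1.5 GB/7%.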

Yousuf Khan

Yousuf Khan[_2_] May 1st 20 08:15 AM

Why is this folder so slow?
 
On 4/28/2020 11:13 AM, Frank Slootweg wrote:
Yousuf Khan wrote:
Yes, that is exactly the problem, I was getting at. Does Thunderbird
have a new news file format available? My assumption was that
Thunderbird only does 1 file/message? What's the option to convert?


It's not a new News file format, it's a different format.


New to me. LOL ;-)

You set the format in the News account: Tools - Account settings -
your news account in the left pane - 'Server Settings' page -
Message Storage - Message Store Type:. This field *should* be set to
'File per folder (mbox)'. Yours is probably set to 'File per message
(maildir)'.


Actually, it does show "file per folder (mbox)", but it's completely
greyed out, unchangeable. I assume that's just its preferred method
of accessing News, but it's being forced to use the old format anyway.

For your account - i.e. an *existing* account - you probably cannot
change this setting, i.e. you can only set it when you create the
account. Hence my comment about exporting and (re-)importing. If you
cannot change the setting, you will have to export all the articles from
your current account and then re-import all articles into a new account
with 'Message Store Type: File per folder (mbox)'.


Actually, I'm ready to completely blow out all of the files in the news
folder, and redownload from scratch, just so long as my newsgroups list
remains untouched. I obviously have backups of it, so it's not going to
be harmful to me.

If ImportExportTools can not solve your problem, you'll probably have
to search the Thunderbird support site(s)/forum(s) or/and post there.

[1]
https://addons.thunderbird.net/en-GB/thunderbird/addon/importexporttools/?src=userprofile


Thanks, but it looks like this version only works up to Thunderbird
version 60, and I'm on version 68.x. No problem, this is just News, I'll
just wipe it out and redownload.

Yousuf Khan

Yousuf Khan[_2_] May 1st 20 08:20 AM

Why is this folder so slow?
 
On 4/28/2020 3:14 PM, Paul wrote:
The one I was looking at the other day, said that it didn't
handle stuff in the News folder specifically.

As for the availability of the MailboxStore option in the
Server settings, the claim is that you must use this
immediately when the installation of Thunderbird is
brand new. In my experiments yesterday, I tried to "clean out"
my profile, and tried not to leave any .msf files, then
set the prefs.js with the maildirstore preference, and
that *still* wasn't enough to make it work. I'm going
to have to nuke the damn thing and start from scratch,
to see if I can get it to work.

One other weirdness from yesterdays experiment, is after
I was finished with my failed experiment, I took the ZIP
file holding my unbroken profile, and started to restore
it to my SSD drive. I was greeted by write rates of arounf
2MB/sec on my SSD. It took forever to restore the fleet
of .msf (file per box) style files. And when I opened
Task Manager, MsMpEng was railed on one core, scanning
everything being written into the profile area. I've done
plenty of other stuff on the computer, where it doesn't
do that with quite the same level of venom. (If I unpack
an .ova on a scratch drive, it does that at several hundred
megabytes per second. As if MsMpEng didn't care.)

Paul


Oh, it's a good thing I kept reading the replies, as it looks like you
already tried what I was about to try. So it kept using the same file
format as before, even after nuking it and starting from scratch?

Yousuf Khan[_2_] May 1st 20 08:22 AM

Why is this folder so slow?
 
On 4/28/2020 5:08 PM, Frank Slootweg wrote:
I think that's not correct. The *installation* doesn't have to be
brand new, the *account* in Thunderbird must be new, i.e. just created.

I added a new New account and could set 'Message Store Type:' to
either 'File per folder (mbox)' or 'File per message (maildir)'.


So what if I nuke all of the old messages in the News folder, and let it
repopulate from scratch?

Yousuf Khan

Yousuf Khan[_2_] May 1st 20 08:24 AM

Why is this folder so slow?
 
On 4/27/2020 6:15 PM, T wrote:
Hi Yousuf,

When I see things like this, it is usually a failing
drive, especially when the index on the offending
directory never finishes.

This will stick out like a sore thumb if you run your
drive through gsmartcontrol: check the error logs and
run the self-tests.


Brand new drive, less than a month old, hasn't had a chance to get old yet.

Yousuf Khan

Paul[_28_] May 1st 20 08:56 AM

Why is this folder so slow?
 
Yousuf Khan wrote:
On 4/28/2020 3:14 PM, Paul wrote:
The one I was looking at the other day said that it didn't
handle stuff in the News folder specifically.

As for the availability of the MailboxStore option in the
Server settings, the claim is that you must use this
immediately when the installation of Thunderbird is
brand new. In my experiments yesterday, I tried to "clean out"
my profile, and tried not to leave any .msf files, then
set the prefs.js with the maildirstore preference, and
that *still* wasn't enough to make it work. I'm going
to have to nuke the damn thing and start from scratch,
to see if I can get it to work.

One other weirdness from yesterday's experiment is that after
I was finished with my failed experiment, I took the ZIP
file holding my unbroken profile, and started to restore
it to my SSD drive. I was greeted by write rates of around
2MB/sec on my SSD. It took forever to restore the fleet
of .msf (file per box) style files. And when I opened
Task Manager, MsMpEng was railed on one core, scanning
everything being written into the profile area. I've done
plenty of other stuff on the computer, where it doesn't
do that with quite the same level of venom. (If I unpack
an .ova on a scratch drive, it does that at several hundred
megabytes per second. As if MsMpEng didn't care.)

Paul


Oh, it's a good thing I kept reading the replies, as it looks like you
already tried what I was about to try. So it kept using the same file
format as before, even after nuking it and starting from scratch?


I would refrain from working in this direction.

Sure, if you have backed up the various folders for TBird
before trying it (like I did when testing), then great.
Just don't do it, without having something to restore from.

It's pretty weird for a function to exist in TBird,
presumably absorbing test time from release to
release, and then to have its usage hobbled by
inept controls.

If you pursue this line of reasoning, what will
happen is your headers will be stripped down to
the event horizon of the server (maybe six months
retention on a free server), and if you have
years of headers (where the MID won't fetch anything
if you click), those are the kinds of headers that
will disappear if you start over again. The headers
from ten years ago, aren't on the server, and cannot
be regenerated from a small server - messing around
will significantly damage your header history.

If the damn thing had a conversion function that
converted equally between the two formats, I might
have a different opinion about doing this. It's just
that this is a feature that doesn't appear finished.

Paul


Yousuf Khan[_2_] May 1st 20 09:13 AM

Why is this folder so slow?
 
On 5/1/2020 3:56 AM, Paul wrote:
Yousuf Khan wrote:
Oh, it's a good thing I kept reading the replies, as it looks like you
already tried what I was about to try. So it kept using the same file
format as before, even after nuking it and starting from scratch?


I would refrain from working in this direction.

Sure, if you have backed up the various folders for TBird
before trying it (like I did when testing), then great.
Just don't do it, without having something to restore from.

It's pretty weird for a function to exist in TBird,
presumably absorbing test time from release to
release, and then to have its usage hobbled by
inept controls.


I just took a chance and deleted all of the old newsgroup folders that
contained all of the old-style messages. I left all of the rest of the
files in that news server's base folder untouched. Then I started
Thunderbird up again. It re-downloaded the messages, and it only
downloaded from where I last left off. It's now filling the data files
known as *.msf (e.g. alt.comp.os.windows-10.msf) rather than filling the
folders! Interestingly, these *.msf files used to exist in this News
folder before, but they were just trivial 1K or 2K files, with nothing
substantial inside them. They are now substantial files, ranging
from 44 KB to 41 MB. So it looks like having those old folders there all
of this time was preventing Thunderbird from using the new style *.msf
files, even though it had long ago created them!

Yousuf Khan

Paul[_28_] May 1st 20 10:11 AM

Why is this folder so slow?
 
Yousuf Khan wrote:
On 5/1/2020 3:56 AM, Paul wrote:
Yousuf Khan wrote:
Oh, it's a good thing I kept reading the replies, as it looks like
you already tried what I was about to try. So it kept using the same
file format as before, even after nuking it and starting from scratch?


I would refrain from working in this direction.

Sure, if you have backed up the various folders for TBird
before trying it (like I did when testing), then great.
Just don't do it, without having something to restore from.

It's pretty weird for a function to exist in TBird,
presumably absorbing test time from release to
release, and then to have its usage hobbled by
inept controls.


I just took a chance and deleted all of the old newsgroup folders that
contained all of the old-style messages. I left all of the rest of the
files in that news server's base folder untouched. Then I started
Thunderbird up again. It re-downloaded the messages, and it only
downloaded from where I last left off. It's now filling the data files
known as *.msf (e.g. alt.comp.os.windows-10.msf) rather than filling the
folders! Interestingly, these *.msf files used to exist in this News
folder before, but they were just trivial 1K or 2K files, with nothing
substantial inside them. They are now substantial files, ranging
from 44 KB to 41 MB. So it looks like having those old folders there all
of this time was preventing Thunderbird from using the new style *.msf
files, even though it had long ago created them!

Yousuf Khan


What I had tested before, was TBird 45 (sufficiently newer than the
TBird 38 that launched maildir). There was no conversion claimed
in TBird 45.

I was just looking at TBird 60.9.1 in a VM here (a setup that's
only used for email testing), and I added a news server to it,
and not only did it offer the button to choose .msf versus
maildir, but when I selected maildir, it claimed to be
"doing a conversion" to the other format. Even though
at that moment, no groups existed.

I added one group, and again it claimed to be doing a
conversion, and now there's a parallel "maildir" folder
which presumes to be a copy of the .msf folder.

If you kept your original setup with the 500000 files,
you might try updating to 60.9.1 or so, and trying
to flip the control using that version. It seemed to
unsubscribe me from the one group I'd selected, but
it seems to have worked. I haven't had time to do much
other testing yet.

Paul

Frank Slootweg May 1st 20 03:55 PM

Why is this folder so slow?
 
Yousuf Khan wrote:
[...]

I just took a chance and deleted all of the old newsgroup folders that
contained all of the old-style messages. I left all of the rest of the
files in that news server's base folder untouched. Then I started
Thunderbird up again. It re-downloaded the messages, and it only
downloaded from where I last left off. It's now filling the data files
known as *.msf (e.g. alt.comp.os.windows-10.msf) rather than filling the
folders! Interestingly, these *.msf files used to exist in this News
folder before, but they were just trivial 1K or 2K files, with nothing
substantial inside them. They are now substantial files, ranging
from 44 KB to 41 MB. So it looks like having those old folders there all
of this time was preventing Thunderbird from using the new style *.msf
files, even though it had long ago created them!


boggle!

If you apparently did not mind deleting all the old articles, then
why did you keep 580,000 old articles in the first place!?

You can set global and per-group retention policies, so if you do not
need so many articles, just set those to appropriate values.

Yousuf Khan[_2_] May 1st 20 05:21 PM

Why is this folder so slow?
 
On 5/1/2020 10:55 AM, Frank Slootweg wrote:
boggle!

If you apparently did not mind deleting all the old articles, then
why did you keep 580,000 old articles in the first place!?


Simple, because I had no idea in any detail what any of the files in
this folder were for, what was important, or where exactly the data
resided, so I just backed up everything. That way I wouldn't have to
recreate everything from scratch and go through hours of debugging.
I've had situations where just one important file goes missing, which
screws up the entire configuration, and trying to find that one
missing file among half a million is like finding a needle in a haystack.

So now after the deletion, I'm down from half million to only about 600
files. And I did a test backup, and the backup went from over 8 hours,
down to only 2.5 minutes! My feeling is that perhaps a lot of those
half-million files were just left over from decades of junk that
Thunderbird did not clear, even though it said it was clearing them.
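A quick way to see where file counts and bytes actually pile up before
deleting anything is to tally each top-level subfolder of the profile.
A generic sketch (not Thunderbird-specific; `census` is a name made up
here, point it at the profile folder):

```python
import os

def census(root):
    """Tally (file count, total bytes) per immediate subfolder of root.

    Files sitting directly in root are tallied under the key ".".
    Helps spot which subfolder holds the mass of tiny files that
    makes a file-based backup crawl.
    """
    totals = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        # Attribute every file to the top-level subfolder it lives under.
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        count, size = totals.get(top, (0, 0))
        for name in filenames:
            try:
                size += os.path.getsize(os.path.join(dirpath, name))
                count += 1
            except OSError:
                pass  # file vanished mid-scan; skip it
        totals[top] = (count, size)
    return totals
```

Sorting the result by count would have flagged the News folder's
580,000 files immediately.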

Yousuf Khan

Yousuf Khan[_2_] May 1st 20 05:26 PM

Why is this folder so slow?
 
On 5/1/2020 5:11 AM, Paul wrote:
If you kept your original setup with the 500000 files,
you might try updating to 60.9.1 or so, and trying
to flip the control using that version. It seemed to
unsubscribe me from the one group I'd selected, but
it seems to have worked. I haven't had time to do much
other testing yet.


Well, I have been completely up to date on the Thunderbird releases for a
while now. I was running 68.7, even before this.

I think what's happening here is that Thunderbird wasn't expecting
long-termers like me to be continuously using their product. I'd
been using Thunderbird since version 0.something, and what was around
back then is not what is around now, so they had no plans for how to
migrate old-timers like me. They just kept using the old-format files
in my setup, even though the new format already existed, and simply
ignored the new one.

Yousuf Khan

Frank Slootweg May 1st 20 06:52 PM

Why is this folder so slow?
 
Yousuf Khan wrote:
On 5/1/2020 10:55 AM, Frank Slootweg wrote:
boggle!

If you apparently did not mind deleting all the old articles, then
why did you keep 580,000 old articles in the first place!?


Simple, because I had no idea in any detail what any of the files in
this folder were for, what was important, or where exactly the data
resided, so I just backed up everything. That way I wouldn't have to
recreate everything from scratch and go through hours of debugging.
I've had situations where just one important file goes missing, which
screws up the entire configuration, and trying to find that one
missing file among half a million is like finding a needle in a haystack.


I can - sort of - understand that, but because these 580,000 files were
giving you so much hardship, I would have expected you to look at a
few of them, see that they were just News articles and take it from
there, i.e. set/lower the News retention settings in Thunderbird.

So now after the deletion, I'm down from half million to only about 600
files. And I did a test backup, and the backup went from over 8 hours,
down to only 2.5 minutes! My feeling is that perhaps a lot of those
half-million files were just left over from decades of junk that
Thunderbird did not clear, even though it said it was clearing them.


What you saw about "clearing" (the actual term is 'Compact'(ing)) is
for e-mail, not for News. This was already mentioned in this thread,
IIRC by VanguardLH. E-mail folders need to be compacted, because you
might delete some messages from a folder, so the .msf file needs to be
compacted to recover the space occupied by the deleted messages. News
articles can be deleted as well (in Thunderbird), but most people won't,
because there's no point, because you can only delete your *copy*, not
the copies on the rest of The Net.

Anyway, you should probably set the (News) retention settings,
otherwise the storage will grow again without bounds, if not in number
of files, then in number of MBs/GBs.

Yousuf Khan[_2_] May 1st 20 10:36 PM

Why is this folder so slow?
 
On 5/1/2020 1:52 PM, Frank Slootweg wrote:
I can - sort of - understand that, but because these 580,000 were
giving you so much hardship, I would have expected you to look at a
few of them, see that they were just News articles and take it from
there, i.e. set/lower the News retention settings in Thunderbird.


No, I knew those were the message files; considering that there were so
many of them, what else could they have been? But often there are other
files interspersed among them that can go overlooked, because they're
overwhelmed by the mass of the main files. It's easier to just let the
backup software back all of it up.

Yousuf Khan



Powered by vBulletin® Version 3.6.4
Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
HardwareBanter.com