Copy-Item is slower than File Explorer

61 points

by hiAndrewQuinn

8 hours ago

| 16 comments

| til.andrew-quinn.me

▲

chihuahua

6 hours ago

[-]

To properly appreciate a post like this one, it should ideally be paired with a Raymond Chen post that argues in Hercule Poirot style irrefutable logic how a combination of backwards-compatibility with CP/M and some 1990s programming done by raccoons means that this messed up state of affairs is logically the only way it could possibly be.

▲

jodrellblank

1 hour ago

[-]

Raymond Chen from 2004, slightly unsatisfying[1]. Try Mark Russinovich's post with technical details about rewriting Explorer's copy routines from XP to Vista[2].

Windows 8 team blog on updating the copy dialog UX[3].

[1] https://devblogs.microsoft.com/oldnewthing/20040106-00/?p=41...

[2] https://web.archive.org/web/20120608005952/http://blogs.tech...

[3] https://learn.microsoft.com/en-us/archive/blogs/b8/improving...

▲

r1ch

7 hours ago

[-]

OP mentions using "Cat 7" cables - please don't buy these. Cat 7 isn't something that exists in TIA/EIA standards, only in ISO/IEC and it requires GG45 or TERA connectors. Cat 7 with RJ45 connectors isn't standardized, so you have no idea what you're actually getting. Stick with pure copper Cat 6A.

▲

Arrowmaster

36 minutes ago

[-]

Absolutely agreeing with you but replying to you instead of multiple others below with my views on this.

Cat6A can do 10Gbps at 100m. Cat7 and Cat8 can do higher speeds in short runs but those technologies are DEAD in DC tech now. 40G is legacy tech using four lanes of 10G, replaced by 100G which is four lanes of 25G. Copper patch cables are not used with these, everything is fiber or DAC.

If you use a Cat7 or Cat8 cable the higher MHz support listed on the spec will never be used. When using a real cable of these qualities all you are really getting is better protection from outside interference.

When buying premade patch cables only buy Cat6A. Anything you see online saying Cat7 or Cat8 has probably never been properly tested by the manufacturer.

When buying a spool of wire do your research on the manufacturer. There's plenty with false labels out there. I once saw a spool of 'Cat6e' which is not a real standard.

When paying others to run cables, find out what brand and what warranty the installer is providing. If they only use Cat7 and cannot give a good explanation of why, they might not know as much as you should expect them to.

▲

kuschku

5 hours ago

[-]

Cat 7 with RJ45 sockets is standardized, which is ideal for running it in walls. Sure, you don't want to use it for patch cables, but over long runs in walls it's a great solution, especially due to the better shielding of Cat 7.

> so you have no idea what you're actually getting

As Cat 7 is only sold by the meter on a roll, you know just as much what you're getting as with Cat 6A, spec-compliant network cables.

▲

someguyiguess

6 hours ago

[-]

What about Cat 8? I know it’s not really used in consumer grade applications but is it in TIA/EIA standards?

▲

r1ch

6 hours ago

[-]

Yes, that's standardized but is only rated for up to 30 meters at the higher speeds you get from it, so it's not very useful outside of server room / data center applications and you probably want to be using fiber at that point.

▲

0xC0ncord

6 hours ago

[-]

For what it's worth, I recently bought a spool of CAT7 cable and a bunch of RJ45 connectors and made my own cables that perform well and reliably. I don't know if this was wise in the end but I was able to get what I needed out of it.

▲

kichik

7 hours ago

[-]

Invoke-WebRequest is also very slow if you forget to disable the progress bar with $ProgressPreference = 'SilentlyContinue'

PowerShell has some "interesting" design choices...

▲

Lariscus

7 hours ago

[-]

It also buffers the downloaded data completely into memory last time I checked. So downloading a file bigger than the available RAM just doesn't work and you have to use WebClient instead.

Another fun one is Expand-Archive, which is painfully slow, while using the System.IO.Compression.ZipFile CLR type directly is reasonably fast. PowerShell is really a head scratcher sometimes.

▲

jeroenhd

6 hours ago

[-]

The download being cached in RAM kind of makes sense, curl will do the same (up to a point) if the output stream is slower than the download itself. For a scripting language, I think it makes sense. Microsoft deciding to alias wget to Invoke-WebRequest does make for a rather annoying side effect, but perhaps it was to be expected as all of their aliases for GNU tools are poor replacements.

I tried to look into the whole Expand-Archive thing, but as of https://github.com/PowerShell/Microsoft.PowerShell.Archive/c... I can't even find the Expand-Archive cmdlet source code anymore. The archive files themselves seem to have "expand" be unimplemented. Unless they moved the expand command to another repo for some reason, it looks like the entire command will disappear at one point?

Still, it does look like Expand-Archive was using the plain old System.IO.Compression library for its file I/O, although there is a bit of pre-processing to validate that paths exist and such, which may take a while.

▲

mort96

6 hours ago

[-]

> curl will do the same (up to a point) if the output stream is slower than the download itself

That "up to a point" is crucial. Storing chunks in memory up to some max size as you wait for them to be written to disk makes complete sense. Buffering the entire download in memory before writing to disk at the end doesn't make sense at all.
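To make the distinction concrete, here is a minimal Python sketch of the two strategies. It is not the code of curl or Invoke-WebRequest; the function names and the 64 KiB chunk size are assumptions chosen for illustration. The bounded version never holds more than one chunk in memory, while the unbounded version holds the entire payload before writing anything.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical cap on how much data is held in memory at once


def copy_stream_bounded(src, dst, chunk_size=CHUNK_SIZE):
    """Copy src to dst one chunk at a time; return (total_bytes, largest_chunk_held)."""
    total = 0
    largest = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
        largest = max(largest, len(chunk))
    return total, largest


def copy_stream_unbounded(src, dst):
    """Read the whole stream first, then write once; return (total_bytes, largest_chunk_held)."""
    data = src.read()  # the entire payload sits in memory here
    dst.write(data)
    return len(data), len(data)


if __name__ == "__main__":
    payload = b"x" * (1024 * 1024)  # stand-in for a download
    print(copy_stream_bounded(io.BytesIO(payload), io.BytesIO()))
    print(copy_stream_unbounded(io.BytesIO(payload), io.BytesIO()))
```

With the bounded version, peak memory is set by the chunk size rather than by the file size, which is why a download larger than RAM can still succeed.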

▲

AHTERIX5000

6 hours ago

[-]

Yep. And 'wget' is often an alias for Invoke-WebRequest in PowerShell. The number of footguns I ran into while trying to get a simple Windows Container CI job running, oh man

▲

ycombinatrix

6 hours ago

[-]

"curl" being aliased to "Invoke-WebRequest" is also a massive dick move

▲

jodrellblank

51 minutes ago

[-]

It's a completely new shell, new commands for everything, no familiar affordances for common tasks, so they add user-configurable, user-removable aliases from DOS/macOS/Linux so that people could have some on-ramp, something to type that would do something. That's not a dick move at all, that's a helpful move.

Harassing the creator/team for years because a thing you don't use doesn't work the way you want it to work? That is.

They removed it in PowerShell core 9 years ago! 9 years! And you're still fixated on it!

▲

ycombinatrix

8 minutes ago

[-]

It is still present in powershell on my up to date windows 11 machine today, so it is disingenuous for you to claim the alias was removed 9 years ago.

The alias confuses people that are expecting to run curl when they type "curl" (duh) and also causes headaches for the actual curl developers, especially when curl is actually installed!

Why the hostile tone? Pretty rude of you to claim I'm fixated on the issue for years and harassing the powershell development team with zero evidence.

▲

pixl97

6 hours ago

[-]

yea, curl.exe and curl are two different commands on windows. Fun stuff.

▲

jodrellblank

47 minutes ago

[-]

Not "on Windows". In PowerShell 5. PowerShell core removed the curl alias 9 years ago.

▲

fourthark

15 minutes ago

[-]

But PowerShell 5.1 is still the one that ships with Windows.

▲

DHowett

6 hours ago

[-]

tar.exe, however, beats both of those in terms of archive format support and speed.

▲

coffeeaddict1

6 hours ago

[-]

I think that's only an issue with Windows Powershell. Powershell 7 works just fine.

▲

orthoxerox

7 hours ago

[-]

Wasn't something like npm much slower as well when it showed a progress indicator by default?

▲

archi42

7 hours ago

[-]

This is atrocious. I get it, some things are less trivial than they seem - but I would be ashamed for shipping something like this, and even more for not fixing it.

▲

Almondsetat

6 hours ago

[-]

Came here to post this, and it's even more egregious when you realize curl is an alias for Invoke-WebRequest

▲

pixl97

6 hours ago

[-]

Be nice if you listed any particular settings on the commands....

For robocopy, for example, if you're copying small files/bunch of directories use the /MT:$number flag. It's so much massively faster it's not like the same application.

Also, is this a newer version of Windows that supports SMB3? Explorer is likely using that to copy more in parallel.

▲

hiAndrewQuinn

12 minutes ago

[-]

For robocopy, I realized only after pushing the article up that I had left out that I was using /MT:32. I put this in a comment at the bottom of the page since I was short on time and energy.

▲

charcircuit

6 hours ago

[-]

>no scripting options at all on Windows that come close to the fearsome power of File Explorer’s copy and paste out of the box

You can use Powershell.

  $shell = New-Object -ComObject Shell.Application
  $shell.Namespace("C:\Source").ParseName("myfile").InvokeVerb("copy")
  $shell.Namespace("C:\Destination").Self.InvokeVerb("paste")

▲

hugh-avherald

5 hours ago

[-]

It's not the most succinct language, I'll give it that.

▲

valiant55

4 hours ago

[-]

Copy-Item is a cmdlet, the native way to do it in PowerShell. Gp posted a hack to replicate the GUI in PowerShell.

▲

jiggawatts

4 hours ago

[-]

But it is readable, which is much more important.

Even someone who has never seen a Windows PC in their life could guess what this script does.

Linux and Unix shell commands use completely arbitrary single letter parameters that must be looked up or memorised. That’s not a virtue.

▲

pathartl

2 hours ago

[-]

Not to mention, everything-is-an-object is a much better experience than text only.

▲

cheema33

7 hours ago

[-]

I am not surprised. My Windows 11 systems with modern and beefy hardware frequently run very slowly for reasons unknown. I did use https://github.com/Raphire/Win11Debloat recently and that seems to have helped. Windows by default comes with a lot of crap that most of us do not use, but it consumes resources anyway.

I have been considering a move back to Linux. It is only Microsoft Teams on Windows that I have to use daily that is holding me back.

▲

steve1977

34 minutes ago

[-]

> It is only Microsoft Teams on Windows that I have to use daily that is holding me back.

I'm very sorry.

▲

mft_

7 hours ago

[-]

> I have been considering a move back to Linux. It is only Microsoft Teams on Windows that I have to use daily that is holding me back.

Me too. I've not tried this yet, but will soon: https://github.com/IsmaelMartinez/teams-for-linux

▲

mgerdts

5 hours ago

[-]

Robocopy has options for unbuffered IO (/J) and parallel operations (/MT:N) which could make it go much faster.

Performing parallel copies is probably the big win with less than 10 Gb/s of network bandwidth. This will allow SMB multichannel to use multiple connections, hiding some of the slowness you can get with a single TCP connection.

When doing more than 1-2 GB/s of IO the page cache can start to slow IO down. That’s when unbuffered (direct) IO starts to show a lot of benefit.
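For anyone who wants to try the parallel-copy idea outside robocopy, a rough Python sketch follows. It is only an illustration of the technique, not how robocopy's /MT is implemented; `copy_tree_parallel` and the default of 8 workers are made-up names and values, and it does not do unbuffered IO at all.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_tree_parallel(src_dir, dst_dir, workers=8):
    """Copy every file under src_dir into dst_dir using a pool of worker threads."""
    src_root = Path(src_dir)
    dst_root = Path(dst_dir)
    files = [p for p in src_root.rglob("*") if p.is_file()]

    def copy_one(src_path):
        # Recreate the relative directory layout under the destination.
        target = dst_root / src_path.relative_to(src_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src_path, target)
        return target

    # Several copies are in flight at once, so one slow file does not stall the rest.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(copy_one, files))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        src.mkdir()
        for i in range(100):
            (src / f"file{i}.txt").write_text("hello")
        copied = copy_tree_parallel(src, Path(tmp) / "dst")
        print(len(copied), "files copied")
```

Whether this beats a single-threaded copy depends on the link and the server; it tends to matter most with many small files, where per-file latency dominates.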

▲

DustinEchoes

8 hours ago

[-]

Never assume anything done in Powershell is fast.

▲

sgc

7 hours ago

[-]

It's fortunately been years since I have used Windows, but it looks like the old staples are still ahead of the curve:

https://fastcopy.jp/

https://www.codesector.com/teracopy

(I have certainly forgotten at least one...)

▲

cm2187

6 hours ago

[-]

One thing I don't understand with Windows Server is that no matter how fast the NVMe drives I use, or how I pair/pool them, I can't get a normal file copy to go faster than around 1.5GB/s (that's local, no network). The underlying disks show multi-GB/s performance under CrystalDiskMark. But I suspect something in the OS must get in the way.

▲

docjay

3 hours ago

[-]

Your system ~3+ years old or so? Your story screams DMI 3.0 or similar PCH/PCIe switch saturation issue. DMI 3.0 caps at ~3.5GB/s but about 1.5-1.7 when bidirectional, such as drive to drive. If 100% reads hit about 3.5 then you’ve all but confirmed it. ~1.5GB/s bidirectional is a super common issue for a super common hardware combination.

It’ll happen if your U.2 ports route through DMI 3.0 PCH/Chipset/PCIe switch rather than directly to the CPU PCIe lanes. Easiest to just check motherboard manual, but you can use hwinfo to inspect the PCI tree and see if your U.2 ports are under a “chipset” labeled node. You might have different ports on the mobo that are direct, or possibly bios changes to explore. Sometimes lots of options, sometimes none. Worst case a direct PCIe adapter will resolve it.
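As a sanity check on those numbers, here is a small Python calculation of the raw line rate. It assumes DMI 3.0 is electrically a PCIe 3.0 x4 link (8 GT/s per lane, 128b/130b encoding), which is my understanding rather than something stated above; the function name is made up for the example.

```python
def pcie_link_gbytes_per_s(gigatransfers_per_s, lanes, payload_bits=128, encoded_bits=130):
    """Raw one-direction line rate of a PCIe-style link in GB/s, after line encoding."""
    usable_gbits = gigatransfers_per_s * lanes * payload_bits / encoded_bits
    return usable_gbits / 8  # bits to bytes


# Assumption: DMI 3.0 behaves like four PCIe 3.0 lanes at 8 GT/s each.
dmi3 = pcie_link_gbytes_per_s(8.0, 4)
print(f"{dmi3:.2f} GB/s")  # prints "3.94 GB/s"
```

That is the ceiling before packet and protocol overhead, so a practical figure around 3.5GB/s is plausible. The ~1.5-1.7GB/s drive-to-drive figure is an observed number from the comment above and is not derived by this calculation.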

▲

mgerdts

5 hours ago

[-]

In addition to my other comments about parallel IO and unbuffered IO, be aware that WS2022 has (had?) a rather slow NVMe driver. It has been improved in WS2025.

▲

jiggawatts

4 hours ago

[-]

I just benchmarked this to death using a 24-core VM with two different kinds of NVMe storage.

Windows Server 2025 is somewhat better on reads but only at low parallelism.

There’s no difference on writes.

▲

g-mork

6 hours ago

[-]

If it's over SMB/Windows file sharing then you might be looking at some kind of latency-induced limit. AFAIK SMB doesn't stream uploads, they occur as a sequence of individual write operations, which I'm going to guess also produce an acknowledgement from the other end. It's possible something like this (say, client waiting for an ACK before issuing a new pending IO) is responsible

What does iperf say about your client/server combination? If it's capping out at the same level then networking, else something somewhere else in the stack.

I noticed recently that OS X file IO performance is absolute garbage because of all the extra protection functionality they've been piling into newer versions. No idea how any of it works, all I know is some background process burns CPU just from simple operations like recursively listing directories

▲

cm2187

6 hours ago

[-]

The problem I describe is local (U.2 to U.2 SSD on the same machine, drives that could easily perform at 4GB/s read/write, and even when I pool them in RAID0 in arrays that can do 10GB/s).

Windows has weird behaviors for copying. For example, if I pool some SAS or NVMe SSDs in Storage Spaces parity (~RAID5), the performance in CrystalDiskMark is abysmal (~250MB/s), but a Windows copy will be stable at about 1GB/s over terabytes of data.

So it seems that whatever they do hurts in certain cases and severely limits the upside as well.

▲

abbeyj

7 hours ago

[-]

The page is 404 now. It looks like something went wrong when the author was trying to push a small edit to the page. The content is viewable at https://github.com/hiAndrewQuinn/til/blob/main/copy-item-is-...

▲

doormatt

7 hours ago

[-]

Works fine for me.

▲

bakugo

7 hours ago

[-]

Just tried copying a 20GB file to my Windows desktop from a mounted Samba share through gigabit ethernet (nvme on both sides). Explorer, Copy-Item and robocopy all saturated the connection with no issues.

There's definitely something off about OP's setup, though I have no idea what it could be. I'd start by checking the latency between the machines. Might also be the network adapter or its drivers.

▲

ninkendo

7 hours ago

[-]

My first thought would be some kind of "security" software (maybe even as simple as windows defender) inspecting the files as they're coming in, which might be done for any process not on some allow-list. And maybe the allow-list is basically just "explorer.exe". And maybe it's faster at checking some processes than others.

▲

kachapopopow

7 hours ago

[-]

rsync being that much slower makes no sense. Back when I used Windows, rsync was saturating 1 gig easily, so this has to be running on a very slow Pentium or something.

▲

zaptheimpaler

8 hours ago

[-]

ugh, I don't know why copying files and basic I/O is so fucked on Windows. Recently I was trying to copy some large movie files between 2 folders on an NVME SSD formatted to ExFAT in a good USB-C enclosure connected over 20Gbps USB-C port and explorer would literally just freeze & crash doing that. I had to copy one file at a time to make it not crash, and then it would have this weird I/O pattern where the transfer would do almost nothing for 1-2 minutes, then the speed eventually picked up.

This isn't even going into WSL. I specifically stopped using WSL and moved to a separate linux devbox because of all the weirdness and slowness with filesystem access across the WSL boundary. Something like listing a lot of files would be very slow IIRC. Slightly tangentially, the whole situation around sharing files across OSes is pretty frustrating. The only one that works without 3rd party paid drivers on all 3 major OSes is ExFAT and that is limited in many other ways compared to ext4 or NTFS.

▲

jeroenhd

6 hours ago

[-]

Explorer freezing halfway through copying happens all the time for me, usually it means Windows' I/O buffer is full and the drive is taking its sweet time actually doing data transfers. Windows will happily show you gigabytes per second being copied to a USB 2.0 drive if your RAM is empty enough, but it'll hang when it tries to flush.

Sometimes it's interference, sometimes the backing SSD is just a lot slower than it says on the box. I've also seen large file transfers (hundreds of gigabytes) expose bad RAM as caches would get filled and cleared over and over again.

You should be able to go into the Windows settings and reduce the drive cache. Copying will be slower, but behaviour will be more predictable.

▲

toast0

7 hours ago

[-]

This feels like USB 3 SuperSpeed flakiness. Did you do all the usual things of trying different ports, moving sources of interference, etc.? Front ports at SuperSpeed are typically the most trouble.

▲

jeroenhd

7 hours ago

[-]

Looking at the source code of Copy-Item (assuming the author is using a recent version of PowerShell) at https://github.com/PowerShell/PowerShell/blob/master/src/Mic... which calls https://github.com/PowerShell/PowerShell/blob/master/src/Sys..., there seems to be quite a bit of (non-OS) logic that takes place before copying across the network. Copying many small files probably triggers some overhead there.

Then, when the copying happens, this seems to be the code that actually copies the file, at least when copying from remote to local, using the default file system provider: https://github.com/PowerShell/PowerShell/blob/master/src/Sys...

Unless I've taken a wrong turn following the many abstraction layers, this file copy seems to involve connecting to a remote server and exchanging the file contents over a base64 stream (?) using nothing but a standard OutputStream to write the contents.

This means that whatever performance improvements Microsoft may have stuffed into their native network filesystem copy operations don't seem to get triggered. The impact will probably differ depending on whether you're copying Windows-to-Windows, SAMBA-to-Windows, or Windows-to-SAMBA.

I'm no PowerShell guru, but if you can write a (C#?) cmdlet to invoke https://learn.microsoft.com/en-us/windows/win32/api/shellapi... with a properly prepared https://learn.microsoft.com/en-us/windows/win32/api/shellapi... rather than use the native Copy-Item, I expect you'd get the exact same performance you'd get on Windows Explorer.

However, the other measurements do show some rather weird slowdowns for basic filesystem operations over SFTP or WSL2. I think there's more at play there, as I've never seen sftp not reach at least a gigabit given enough time for the window sizes to grow. I think the NAS itself may not be very powerful/powerful enough to support many operations per second, limiting the output for other copy tools.

As an alternative, Windows contains an NFS client that can be tuned to be quite fast, which should have minimal performance overhead on Linux if kernel-NFS is available.

▲

jodrellblank

1 hour ago

[-]

> connecting to a remote server and exchanging the file contents over a base64 stream (?)

I think that's not the right code because it's in "PerformCopyFileFromRemoteSession" and that sounds like it's for Copy-Item -ToSession ... -FromSession ... which are for New-PSSessions (PowerShell remoting/WinRM). Those are already Powershell-serialized-in-XML-in-WinRM-XML (I think) and copying file data plausibly goes into Base64 to go inside that.

That can't be what happens if you do Copy-Item -Destination \\server\share\

▲

pixl97

6 hours ago

[-]

>PowerShell guru, but if you can write a (C#?) cmdlet

Yea, I have a workload that has to delete millions of directories/small files on occasion and we wrote a cmdlet to spawn a huge amount of threads to perform the delete to keep the IOPS saturated and it performs much better than explorer or other deletion methods.

▲

8 hours ago

[-]

> SFTP is an encrypted protocol, so maybe those CPU cycles add up to a lot of extra work over time or slowdown. That… shouldn’t feel convincing to anyone who gives it more than 15 seconds of thought, but we all live with our eyes wide shut at times.

FWIW, I previously spent some time trying to get the maximum possible throughput when copying files between a Windows host and a Linux VM, and the encryption used by most protocols did actually become a bottleneck eventually. I expect this isn't a big factor on 1gbps ethernet, but I've never measured it.

▲

r1ch

7 hours ago

[-]

The bottleneck with SFTP / SCP / SSH is usually the server software - SSH can multiplex streams, so it implements its own TCP-style sliding windows for channel data. Unfortunately OpenSSH and similar server implementations suffer from the exact same problems that TCP did, where the windows don't scale up to modern connection speeds, so the maximum data in-flight quickly gets limited at higher BDPs.

HPN-SSH[1] resolves this but isn't widely deployed.

[1] https://www.psc.edu/hpn-ssh-home/
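The arithmetic behind the window limit can be sketched in a few lines of Python. The window size and round-trip time in the demo are hypothetical values picked for illustration, not measured OpenSSH defaults, and both function names are invented here.

```python
def max_throughput_mbit_s(window_bytes, rtt_ms):
    """Upper bound on throughput when at most window_bytes can be in flight per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1_000_000


def bdp_bytes(link_mbit_s, rtt_ms):
    """Bandwidth-delay product: bytes that must be in flight to keep the link full."""
    return link_mbit_s * 1_000_000 / 8 * (rtt_ms / 1000)


if __name__ == "__main__":
    window = 2 * 1024 * 1024  # hypothetical 2 MiB channel window
    rtt = 20                  # hypothetical 20 ms round-trip time
    print(f"window-limited ceiling: {max_throughput_mbit_s(window, rtt):.0f} Mbit/s")
    print(f"bytes needed in flight for 10 Gbit/s: {bdp_bytes(10_000, rtt):.0f}")
```

If the window is smaller than the bandwidth-delay product, the sender spends part of every round trip idle waiting for window updates, no matter how fast the link or the CPU is.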

▲

itsthecourier

7 hours ago

[-]

want to see rsync WSL 1 in that comparison

filesystem should be faster in WSL2 but not if the file resides in the windows path I think