However, reading the spec, it doesn't look fully backwards compatible? It seems like there are file structures which are possible to represent in FAT which aren't possible to represent in BigFAT. In FAT, I could have a 4GB-128kB size file called "hello.txt", and next to it, a file called "hello.txt.000.BigFAT". A FAT filesystem will show these as two separate files, as intended, but a BigFAT implementation will show them as one file, "hello.txt". That makes this a breaking change.
I would kind of have hoped that they had found an unused but always-zero bit in some header which could be repurposed to identify whether a file has a continuation or not, or some other clever way of ensuring that you can represent all legal FAT32 file structures.
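To make the collision above concrete, here is a minimal sketch of how a BigFAT-aware directory listing might fold continuation files into their base file. The name pattern is inferred only from the example name "hello.txt.000.BigFAT" in this comment, and `bigfat_view` is a hypothetical helper, not anything taken from the spec.

```python
import re

# Assumed continuation-name pattern, based only on the example name
# "hello.txt.000.BigFAT" quoted in the thread; the real spec may differ.
CONTINUATION = re.compile(r"^(?P<base>.+)\.(?P<index>\d{3})\.BigFAT$")


def bigfat_view(raw_names):
    """Return the names a BigFAT-aware listing would show for a raw FAT directory."""
    names = set(raw_names)
    visible = []
    for name in raw_names:
        match = CONTINUATION.match(name)
        # Hide a continuation file only when its base file sits next to it.
        if match and match.group("base") in names:
            continue
        visible.append(name)
    return visible


raw = ["hello.txt", "hello.txt.000.BigFAT", "notes.txt"]
print(bigfat_view(raw))  # ['hello.txt', 'notes.txt']
```

A plain FAT driver would list all three names. Under this sketch, the second one disappears from view, which is exactly the directory layout that can no longer be represented.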
ReactOS is using btrfs, which has so many useful options that FAT will never see (zstd, xxhash, flash-aware options, snapshots, send/receive, etc.). This is positioned both for Linux and Windows.
Microsoft itself restrains ReFS to enterprise use, and btrfs offers so much more functionality. We should stop using a file system from the '80s.
Btrfs has a lot of bugs while being active for a long time. This is mostly related to its complexity.
If I'm going to implement a filesystem for custom hardware, I would definitely not choose btrfs.
Anything involving embedded and without deep pockets has no other option, FAT (sadly) still is the least common denominator. Some speak ExFAT, but not sure how good the tooling support is outside of Microsoft, and there are still patent concerns.
I just want to transfer files on USB sticks without worrying about file size or the OS accessing it. The infuriating part is that it is 2022 and if you want to reliably and easily move files larger than 3-4GB on removable media, people tell you to use proprietary MS file systems like exFAT and NTFS. That is unacceptable.
We NEED a simple, portable, and freely open file system spec for removable media that handles large drives and files.
I would note that one way to work around the cost-incentives of IoT manufacturers, would be to encourage them to externalize the storage-layer costs from the device's BOM, by focusing on getting "object-storage oriented" NAND flash controllers pushed down from enterprise to regular retail availability. That way, all the filesystem-layer smarts end up living in the SD card itself — which is sold separately. (It'd be sort of a second coming of the ancient Commodore 1540/1541 paradigm, where the disk controller presented not as block storage, but as, essentially, a single-user serial-attached NAS.)
On the other hand, NTFS on Windows, Ext* on Linux, or ZFS on any supported OS, has not been known to eat data as frequently.
Imagine someone takes a bigfat drive and puts it in a non-bigfat capable machine, then zips up a directory and publishes it.
When that directory is unzipped on a bigfat machine, should the bigfat files be re-joined, or should they show as separate files? One breaks the OS file API and the unzip program might crash/fail, while the other leads to the application trying to create filenames which can't exist in the filesystem.
For example, LFN fails if you create too many files with the same first 6 letters :)
I'm actually honestly not sure why representing all legal FAT32 file structures is a particularly useful goal?
FAT in particular, in all of its forms, has always had limitations and weirdness in filenames, etc.
If BigFAT was actually backwards compatible, it would've been a no-brainer to add support for in filesystem drivers. But since it changes the interpretation of some legitimate structures, adding support for BigFAT is a breaking change. I don't know whether operating systems will want to make breaking changes to their FAT32 filesystems, but it certainly seems like a bigger ask.
So, this is a bit of a cultural/perception gap between FOSS developers and standards bodies. Most standards bodies have a patent policy of "as long as all the standards-essential patents are licensable for a uniform fee, we're good". Convincing patent holders to not extract royalties from their patents for the sake of easing the lives of FOSS implementers is much, much harder[0].
Microsoft isn't even the only SEP holder for SD, and the standard makes no attempt at being a royalty-free standard. In fact, early SD standards were NDA'd[1] and prohibited FOSS implementation at all.
[0] In fact, so hard that the EU has a conspiracy theory that Google/AOM bullied a patent holder into doing this
[1] Remember, SD cards were basically MMC with primitive DRM
> exFAT was a proprietary file system until 2019, when Microsoft released the specification and allowed OIN members to use their patents.
https://patents.google.com/patent/US20090164440?oq=US2009164...
I don’t know exactly what that means. But it sounds like something different from “we hereby grant everybody a license to any and all exFAT patents.”
While BigFAT not being encumbered by any patents is a good thing, the camera industry has pretty much standardized on exFAT for its removable file storage format. Something I'm curious about is how a 5GB video file (quite common and actually on the smaller side for 4K and 8K recording sessions) is written and accessed between the two file systems. BigFAT says that the file would be written in 4GB chunks; is there something similar happening with exFAT, or is the file "one chunk"? (Apologies if I have the terms wrong -- I'm not a filesystem expert.) The author laments that the exFAT format has been adopted for SDXC cards, but given who is in this group and what their use cases are, I can discount "because Microsoft strong-armed them" as a reason for them selecting it.
I'm guessing they didn't if FAT12/16/32->exFAT driver changes are comparatively simple, and/or results in a smaller code base to support FAT32 and exFAT on the same device (e.g. a camera).
I guess 4GB seemed like a reasonable limit when FAT32 was designed.
Most likely FAT32 has a 32bit number for file size and ExFAT presumably has either a 64bit one or stores file size in some format other than bytes.
FAT32 was always seen as stop-gap measure for low-end consumer hardware when introduced in 1996; NTFS was introduced 3 years prior to handle terabyte-scale data for enterprise users.
> Most likely FAT32 has a 32bit number for file size and ExFAT presumably has either a 64bit one
Correct.
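As a small illustration of the point just confirmed: a FAT32 directory entry stores the file size in a 4-byte field, while exFAT uses 8-byte length fields. The sketch below only checks whether a given size fits in each width; `fits_in_field` is a hypothetical helper, not part of any filesystem driver.

```python
import struct


def fits_in_field(size, fmt):
    """Return True if `size` can be packed into the given little-endian field."""
    try:
        struct.pack(fmt, size)
        return True
    except struct.error:
        return False


FAT32_SIZE_FIELD = "<I"  # 4-byte unsigned, little-endian
EXFAT_SIZE_FIELD = "<Q"  # 8-byte unsigned, little-endian

largest_fat32_file = 2**32 - 1
print(largest_fat32_file)  # 4294967295 bytes, one byte short of 4 GiB

five_gb = 5 * 10**9
print(fits_in_field(five_gb, FAT32_SIZE_FIELD))  # False
print(fits_in_field(five_gb, EXFAT_SIZE_FIELD))  # True
```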
My suspicion is that customers are bugging them to support large files in emFile and they don't want to pay the license fee for exFAT. I think they can't even do that with their current licensing model, which is a one-time payment per product (not per item) or per product family.
EDIT: I tried to find out if Microsoft's exFAT is licensed per product or per unit and I found that it used to be a 300000 USD flat fee in 2009 but seems to be free since 2019. So my theory from above has no basis and I wonder why Segger does not simply implement exFAT?
Source:
https://www.microsoft.com/en-us/legal/intellectualproperty/t...
https://www.paragon-software.com/exfat-license/
https://en.wikipedia.org/wiki/ExFAT#Legal_status
You may be off the hook if you use Linux >= 5.7. And it seems that you are off the hook if you are a member of the Open Invention Network (OIN).
But SEGGER's embOS is not based on Linux and their customers are OEMs themselves. So their customers would need to be OIN members or pay royalties to MS.
Can somebody fill me in here: where's the value in what SEGGER is proposing, as opposed to what the entire IT community has already been doing for decades?
Regarding the format, once you convert it, does the target device need to have a driver to support the format? It mentions that this would allow for > 4GB files for TVs, but these are typically non-updated very out of date OSes.
I think MS missed a trick by not making the boot sector also contain a simplistic driver, although it would have been a push to keep it all down at 512 bytes.
It's simple enough to work on basically anything, but for what purpose?
The max file size on FAT12/16 is the same as the max drive size.
And FAT32 is very easy to implement for any system dealing with multiple megabytes.
I love seeing this, don't get me wrong. I am just curious if there are any real-world applications for this?
Straight from their FAQ: We see emFile customers asking for solutions for bigger files. Implementing exFAT is not an option for us, as it is patent encumbered. SEGGER would need Microsoft's permission to implement and offer it, and our customers need to deal with Microsoft again to be able to use it in their products. This can be time-consuming and also expensive. We feel there should be a free alternative. The more popular BigFAT becomes, the better.
I guess using anything but FAT would make it hard for their developer base.
This would be extremely niche but would have its audience. Heck, I wouldn't be surprised if HN readers adopted it just to taste the thrill of unlimited and obscene power.
I'm very ignorant to this but I'd love some insight from someone vastly more knowledgeable than me.
The I/O API or "commands" are the same; different filesystems will implement it differently.
I wasn't aware of the intricacies of this stuff - I'll have to do some more reading in my free time.
And what about converting FAT32 to a linux partition? Or buy a new disk and move data over to that.
Edit: it is a genuine question. downvote implies not but honestly it is.
It sounds like BigFat is an extension that takes away the need to do this at the application level. The code does all the splitting and merging for you so you can write a program that acts like the file is on a file system that supports files > 4GB.
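A rough sketch of what "the code does all the splitting and merging for you" could mean inside a driver: map a logical byte offset in the large file to a chunk index plus an offset within that chunk. The chunk size of 4 GiB minus 128 KiB is an assumption taken from the "4GB-128kB" figure mentioned earlier in the thread, and `locate` and `chunk_sizes` are hypothetical names.

```python
# Assumed chunk size (4 GiB - 128 KiB), based on the figure quoted in the
# thread; check the actual spec before relying on it.
CHUNK_SIZE = 4 * 2**30 - 128 * 2**10


def locate(offset, chunk_size=CHUNK_SIZE):
    """Map a logical byte offset to (chunk index, offset inside that chunk)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, chunk_size)


def chunk_sizes(total_size, chunk_size=CHUNK_SIZE):
    """List the sizes of the on-disk pieces for a file of `total_size` bytes."""
    full, rest = divmod(total_size, chunk_size)
    return [chunk_size] * full + ([rest] if rest else [])


five_gib = 5 * 2**30
print(chunk_sizes(five_gib))      # [4294836224, 1073872896]
print(locate(CHUNK_SIZE - 1))     # (0, 4294836223)
print(locate(CHUNK_SIZE))         # (1, 0)
```

Because the mapping is a single division, a driver built this way could seek anywhere in the logical file without reading the earlier pieces.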
If I may it would make more sense (to me at least) to use a directory and have a descriptor file, not entirely unlike multi-part vmdk's are implemented.
> de facto file naming convention
At first glance, it looks like it's using Microsoft's own hack of long file names[1] to create file entries that look like they belong to 1 file. A file that has a long file name (more than the 8+3 character limit) is actually several file entries, but they're empty files. Seems like the tool is creating non-empty files instead, that Windows is chaining together as one.
[1] https://en.wikipedia.org/wiki/Design_of_the_FAT_file_system#...
e.g. some widget without network connection that optionally logs an audit trail to an attached USB media, could still end up with only a couple hundred kilobytes of soldered-on ROM to store the whole firmware, while wanting to write more than 4GB of audit logs.
MICROS~1
(Long name, Microsoft Office.)
So if you create files called MICROS~2 - MICROS~0 in theory you can create enough abbreviated names that no short names remain available for the long filenames you wish to create. Every LFN must have a "real" 8.3 counterpart.
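A toy model of the exhaustion described above, assuming a deliberately simplified alias scheme: first six alphanumeric characters of the name, then "~1" through "~9". Real drivers have further fallbacks once the simple tails run out, so `short_alias` is a hypothetical illustration of the collision, not the algorithm any actual driver uses.

```python
def short_alias(long_name, taken):
    """Pick an 8.3-style alias for `long_name`, or None if all tails are taken.

    Simplified assumption: base is the first 6 alphanumeric characters,
    and only the numeric tails ~1 .. ~9 are tried.
    """
    stem, dot, ext = long_name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = long_name, ""
    base = "".join(c for c in stem.upper() if c.isalnum())[:6]
    ext = "".join(c for c in ext.upper() if c.isalnum())[:3]
    for n in range(1, 10):
        candidate = f"{base}~{n}" + (f".{ext}" if ext else "")
        if candidate not in taken:
            return candidate
    return None


print(short_alias("Microsoft Office", set()))  # MICROS~1

# Pre-create every tail this toy scheme knows about.
squatters = {f"MICROS~{n}" for n in range(1, 10)}
print(short_alias("Microsoft Office", squatters))  # None
```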
They're only "rejoined" by the BigFAT compatible filesystem driver on access. By running such a driver, you're agreeing that such files should "appear" as one.
By all means, choose the Microsoft solution, because patent licensing is good for everyone.
And the bug myth passed into history years ago.
"So, we'll repeat this once more: as a single-disk filesystem, btrfs has been stable and for the most part performant for years."
https://arstechnica.com/gadgets/2021/09/examining-btrfs-linu...
FAT is still popular because it is very easy to implement, and anything can read and write to it. It is pretty easy to implement the file system on a low power microcontroller and have it write data to an SD card. Your users can then plug that SD card into any computer and view the data, or add to it.
Using btrfs in a situation like this means a lot more coding on your end, and your users lose the convenience of the SD card using a file system they can easily interact with.
Nobody is using FAT for their primary system partition. It is almost exclusively relegated to embedded systems and small external storage devices where broad compatibility is an important feature.
We are all forced to pay for this ancient software every time we buy a device that uses it.
Wouldn't this money be better used elsewhere?
https://en.m.wikipedia.org/wiki/File_Allocation_Table#Patent...
Or the sheer ubiquity, and therefore cross-device compatibility.
If you manufacture 100K of them, and save 10 dollars on every piece, you got an extra million in savings.
Yes, the manufacturers will go to great lengths to minimize variable costs. If they can shave $0.25, they will. At volume, it matters.
[1] https://en.m.wikipedia.org/wiki/Universal_Disk_Format
[2] https://duncanlock.net/blog/2013/05/13/using-udf-as-an-impro...
On the contrary (but it is a specific "niche" case) the mentioned vmdk split format allows to mount the vmdk same as monolithic, with full random access.
The actual implementation (by SEGGER or by someone else):
>Q: Can I implement BigFAT myself?
>A: Absolutely. BigFAT is a specification made available by SEGGER. Anybody is free to write a piece of software implementing it. No fees, no royalties, no headaches. You do not even have to let SEGGER or anybody else know.
is what may allow that (random access), this implementation would be useful if - instead of a "feature" of a given app/program - it would be implemented as a filesystem driver of sorts.
According to them (https://btrfs.wiki.kernel.org/index.php/Status) only RAID56 is unstable.
[1] https://kb.synology.com/en-global/DSM/tutorial/How_can_I_rec...
"So, we'll repeat this once more: as a single-disk filesystem, btrfs has been stable and for the most part performant for years."
https://arstechnica.com/gadgets/2021/09/examining-btrfs-linu...
x86_64 with SSE2 is also patent-free right now, as an example.
The shared set there is basically just fat and exfat.
If Microsoft and Apple collaborated on a new filesystem, or even just supported it, then we might have a possible successor. However, even with that, the millions of already shipped devices won't support it. Thus, during the transition period of many years, there will still need to be support for FAT.
That keeps fat the lowest common denominator and everything supporting it.
The only way to get a universal standard is to have the community do it and have enough people use it that the big companies have to capitulate.
But as you pointed out, in a transitionary period there is still a need to support older devices and software. FOSS purists may also not approve of using exFAT in some situations, since the relevant patents have not yet expired, even if MS has released them to the OIN.
3rd parties can write drivers for Windows, you know. A small, read-only FAT partition on a USB stick or SD card could contain the installable drivers necessary to read/write the rest of the disk.
However, that's unnecessary. The best option for a universal file system is UDF. Windows, Mac, and Linux all have full read/write support.
Still, something similar to fuse might help with the licensing.
Do you know what happens when you insert a btrfs-formatted SD card or USB stick into a Windows or macOS machine? It tells you that the drive is unreadable and asks if you want to initialize it. If the user answers yes to that question, the system formats the drive and all of their data is lost.
With a BigFAT-formatted drive, the system will mount it no problem, the user will be able to browse the contents, and the only weird part is that their largest files are split into parts.
2. Switching to BTRFS would be a breaking change. BigFAT wouldn't be. You can still use the card in devices that do not support it, without needing to reformat. Those devices would just lose access to some files.
I've had a hunt for the specific instructions but I'm afraid I can't find it again with a search. The gist of it was to:
- create Btrfs mirror using two qemu virtual disks
- pull the (virtual) plug on one of the pair to disconnect it, then later reconnect it
- Btrfs ends up hosing both the outdated and current copies of the mirror, leading to complete dataloss of the entire mirror