BigFAT – Backward compatible FAT extension for unlimited file size

BigFAT – Backward compatible FAT extension for unlimited file size (segger.com)

196 points by FrankSansC 3 years ago | 106 comments

mort96 3 years ago |

I like the idea. Making it backwards compatible with FAT means that, in principle, regular FAT filesystem implementations could be changed to support big fat files (hehe) transparently.

However, reading the spec, it doesn't look fully backwards compatible? It seems like there are file structures which are possible to represent in FAT which aren't possible to represent in BigFAT. In FAT, I could have a 4GB-128kB size file called "hello.txt", and next to it, a file called "hello.txt.000.BigFAT". A FAT filesystem will show these as two separate files, as intended, but a BigFAT implementation will show them as one file, "hello.txt". That makes this a breaking change.

I would kind of have hoped that they had found an unused but always-zero bit in some header which could be repurposed to identify whether a file has a continuation or not, or some other clever way of ensuring that you can represent all legal FAT32 file structures.
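To make the collision concrete, here is a rough sketch of how a BigFAT-aware directory listing might merge entries. The `<base>.<3 digits>.BigFAT` pattern is inferred only from the "hello.txt.000.BigFAT" example above, and `bigfat_view` is a hypothetical helper, not anything from SEGGER's code. A real implementation would presumably also check that the base file has the maximum part size; this sketch ignores sizes entirely.

```python
import re

# Assumed naming scheme, inferred from the example above:
# continuation parts are named "<base>.<3 digits>.BigFAT".
PART_RE = re.compile(r"^(?P<base>.+)\.(?P<index>\d{3})\.BigFAT$", re.IGNORECASE)


def bigfat_view(names):
    """Group plain FAT directory entries the way a hypothetical
    BigFAT-aware reader might present them."""
    names = list(names)
    present = set(names)
    merged = {}
    for name in names:
        match = PART_RE.match(name)
        # A part is only folded in when its base file exists in the directory.
        if match and match.group("base") in present:
            merged.setdefault(match.group("base"), []).append(name)
    hidden = {part for parts in merged.values() for part in parts}
    view = {}
    for name in names:
        if name not in hidden:
            view[name] = [name] + sorted(merged.get(name, []))
    return view


# Two independent files on a plain FAT volume...
plain_fat = ["hello.txt", "hello.txt.000.BigFAT", "notes.md"]
# ...show up as a single logical file under the assumed convention.
view = bigfat_view(plain_fat)
print(view)
```

The second entry disappears from the listing, which is exactly the case where the two interpretations of the same on-disk structure disagree.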

chasil 3 years ago | |

There are so many good filesystems out there. Is it really necessary to keep dragging FAT along?

ReactOS is using btrfs, which has so many useful options that FAT will never see (zstd, xxhash, flash-aware options, snapshots, send/receive, etc.). This is positioned both for Linux and Windows.

Microsoft itself restrains ReFS to enterprise use, and btrfs offers so much more functionality. We should stop using a file system from the '80s.

kreco 3 years ago | | |

Nothing beats the simplicity of FAT.

Btrfs has a lot of bugs while being active for a long time. This is mostly related to its complexity.

If I were going to implement a filesystem for custom hardware, I would definitely not choose btrfs.

mort96 3 years ago | | |

Camera manufacturers and SD card manufacturers can't start shipping SD cards formatted with btrfs until Windows supports it out of the box. They can start shipping SD cards formatted with FAT32 and software/firmware which reads and writes FAT32+BigFAT.

mschuster91 3 years ago | | |

> Is it really necessary to keep dragging FAT along?

Anything involving embedded and without deep pockets has no other option; FAT (sadly) is still the least common denominator. Some speak ExFAT, but I'm not sure how good the tooling support is outside of Microsoft, and there are still patent concerns.

MisterTea 3 years ago | | |

> ReactOS is using btrfs, which has so many useful options that FAT will never see (zstd, xxhash, flash-aware options, snapshots, send/receive, etc.). This is positioned both for Linux and Windows.

I just want to transfer files on USB sticks without worrying about file size or the OS accessing it. The infuriating part is that it is 2022 and if you want to reliably and easily move files larger than 3-4GB on removable media, people tell you to use proprietary MS file systems like ExFAT and NTFS. That is unacceptable.

We NEED a simple, portable, and freely open file system spec for removable media that handles large drives and files.

derefr 3 years ago | | |

FAT is for weak little cost-optimized embedded-microcontroller devices that write one file at a time to an SD card — which is something we're still building to this day, in the form of IoT devices. We don't really have any better option for this use-case; every newer filesystem is either non-portable, or assumes stronger hardware such that the overhead of using it on these devices would be huge.

I would note that one way to work around the cost-incentives of IoT manufacturers, would be to encourage them to externalize the storage-layer costs from the device's BOM, by focusing on getting "object-storage oriented" NAND flash controllers pushed down from enterprise to regular retail availability. That way, all the filesystem-layer smarts end up living in the SD card itself — which is sold separately. (It'd be sort of a second coming of the ancient Commodore 1540/1541 paradigm, where the disk controller presented not as block storage, but as, essentially, a single-user serial-attached NAS.)

selfhoster11 3 years ago | | |

BtrFS is unsafe for production use unless it's coupled with really good backups.

On the other hand, NTFS on Windows, Ext* on Linux, or ZFS on any supported OS, has not been known to eat data as frequently.

michaelbrave 3 years ago | | |

I mostly use FAT as a go-between for different operating systems. Software could be installed to get similar functionality from another file system, but it's nice to have a default format that is built into everything and works on every machine. It has its flaws, but its universality is a huge strength.

NelsonMinar 3 years ago | |

What FAT32 filesystem in the real world has a file named "foo.000.BigFAT" on it?

londons_explore 3 years ago | | |

I can imagine that if bigfat is successful, such files will start to exist.

Imagine someone takes a bigfat drive and puts it in a non-bigfat capable machine, then zips up a directory and publishes it.

When that directory is unzipped on a bigfat machine, should the bigfat files be re-joined, or should they show as separate files? One breaks the OS file API and the unzip program might crash/fail, while the other leads to the application trying to create filenames which can't exist in the filesystem.

mort96 3 years ago | | |

See my response to https://news.ycombinator.com/item?id=32753207. I'm not saying it will break everyone's FAT32 drives, but it is a breaking change in a filesystem, which seems like something kernel people would usually try to avoid.

DannyBee 3 years ago | |

It's as backwards compatible as any other fat extension done so far.

For example, LFN fails if you create too many files with the same first 6 letters :)

I'm actually honestly not sure why representing all legal FAT32 file structures is a particularly useful goal?

FAT in particular, in all of its forms, has always had limitations and weirdness in filenames, etc.

mort96 3 years ago | | |

I don't understand your LFN example. Which FAT file structure can be represented with LFN disabled that's no longer possible to represent if you add support for LFN?

If BigFAT was actually backwards compatible, it would've been a no-brainer to add support for in filesystem drivers. But since it changes the interpretation of some legitimate structures, adding support for BigFAT is a breaking change. I don't know whether operating systems will want to make breaking changes to their FAT32 filesystems, but it certainly seems like a bigger ask.

saurik 3 years ago | |

I imagine the premise is that if you mount this disk in an implementation that doesn't understand these structures it works and you don't corrupt it, making the format backwards compatible with old implementations. This is similar to the trick used to add long filenames: putting a special 8.3 file with a ~ that includes the full file name.

kmeisthax 3 years ago |

>Unfortunately, exFAT has been adopted by the SD Association as the default file system for SDXC cards larger than 32 GB. In our view, this should never have happened, as it forces anyone who wants to access SDXC cards to get a license from Microsoft, basically making this a field owned by Microsoft.

So, this is a bit of a cultural/perception gap between FOSS developers and standards bodies. Most standards bodies have a patent policy of "as long as all the standards-essential patents are licensable for a uniform fee, we're good". Convincing patent holders to not extract royalties from their patents for the sake of easing the lives of FOSS implementers is much, much harder[0].

Microsoft isn't even the only SEP holder for SD, and the standard makes no attempt at being a royalty-free standard. In fact, early SD standards were NDA'd[1] and prohibited FOSS implementation at all.

[0] In fact, so hard that the EU has a conspiracy theory that Google/AOM bullied a patent holder into doing this

[1] Remember, SD cards were basically MMC with primitive DRM

CodesInChaos 3 years ago |

Are the exFAT patents still a problem nowadays?

> exFAT was a proprietary file system until 2019, when Microsoft released the specification and allowed OIN members to use their patents.

https://en.wikipedia.org/wiki/ExFAT#Legal_status

slavik81 3 years ago | |

The patent will also expire in 2027 [1]. We can look forward to it being entirely unencumbered at that point.

[1] https://patents.google.com/patent/US20090164440?oq=US2009164...

ksec 3 years ago | | |

I sometimes wonder if companies could choose to expire their patents earlier. Especially in cases when there is little to no strategic value in upholding them, but lots of potential value to unlock when they are gone.

NotYourLawyer 3 years ago | |

> We also support the eventual inclusion of a Linux kernel with exFAT support in a future revision of the Open Invention Network’s Linux System Definition, where, once accepted, the code will benefit from the defensive patent commitments of OIN’s 3040+ members and licensees.

I don’t know exactly what that means. But it sounds like something different from “we hereby grant everybody a license to any and all exFAT patents.”

loeg 3 years ago | |

Good for OIN, but it doesn't help non-Linux systems.

mikece 3 years ago |

> Why not exFAT... Microsoft owns several patents, and anyone who implements or uses exFAT technology needs Microsoft's permission, which typically also includes paying fees to Microsoft.

While BigFAT not being encumbered by any patents is a good thing, the camera industry has pretty much standardized on exFAT for its removable file storage format. Something I'm curious about is how a 5GB video file (quite common and actually on the smaller side for 4K and 8K recording sessions) is written and accessed between the two file systems. BigFAT says that the file would be written in 4GB chunks; is there something similar happening with exFAT or is the file "one chunk?" (Apologies if I have the terms wrong -- I'm not a filesystem expert.) The author laments that the exFAT format has been adopted for SDXC cards but given who all is in this group and what their use cases are I can discount "because Microsoft strong-armed them" as a reason for them selecting it.

cmurf 3 years ago | |

The industry could have used UDF. Derived from ISO 9660, but it supports read-write random access storage.

I'm guessing they didn't because FAT12/16/32->exFAT driver changes are comparatively simple, and/or result in a smaller code base to support FAT32 and exFAT on the same device (e.g. a camera).

mikece 3 years ago | | |

And on a camera that costs anywhere from USD$1000 to USD$6500 does the cost of an exFAT license really matter?

lathiat 3 years ago | |

ExFAT is not limited to a 4GB maximum file size. It just has more than 4GB in the file.

I guess 4GB seemed like a reasonable limit when FAT32 was designed.

Most likely FAT32 has a 32bit number for file size and ExFAT presumably has either a 64bit one or stores file size in some format other than bytes.

creshal 3 years ago | | |

> I guess 4GB seemed like a reasonable limit when FAT32 was designed.

FAT32 was always seen as stop-gap measure for low-end consumer hardware when introduced in 1996; NTFS was introduced 3 years prior to handle terabyte-scale data for enterprise users.

> Most likely FAT32 has a 32bit number for file size and ExFAT presumably has either a 64bit one

Correct.
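For anyone who wants to see the limit rather than take it on faith, here is a minimal sketch. It assumes only that the size is stored as a 4-byte little-endian unsigned integer, as in the classic FAT directory entry; `pack_size_field` is a name made up for illustration.

```python
import struct

# Largest value a 32-bit unsigned size field can hold: 4 GiB minus one byte.
MAX_FAT32_FILE_SIZE = 2**32 - 1


def pack_size_field(size_bytes):
    """Pack a file size as a 4-byte little-endian unsigned integer
    (hypothetical helper, mirroring a 32-bit on-disk size field)."""
    return struct.pack("<I", size_bytes)


print(pack_size_field(MAX_FAT32_FILE_SIZE).hex())  # ffffffff

try:
    pack_size_field(5 * 1024**3)  # a 5 GiB video file
except struct.error as exc:
    print("does not fit in 32 bits:", exc)
```

A 64-bit field removes the problem for any realistic file, which is why a wider size field is the obvious fix.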

zinekeller 3 years ago | |

I actually am disappointed that Microsoft has a chance to fix some inherent problems with FAT but didn't, even considering the main use case of a simple FS. Notably, it still has the notorious year 2100 bug (or 2108 bug, depending on the implementation), the metadata is weird and not at all straightforward, it's basically just extending FAT32 and some minor updates since Unicode and UTC are now here.
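A sketch of where the 2108 figure comes from, assuming the classic FAT date packing (7-bit year offset from 1980, 4-bit month, 5-bit day in one 16-bit word), which as far as I know exFAT's timestamp keeps. The function names are made up. The 2100 variant presumably comes from implementations that treat every fourth year as a leap year, but that part is a guess and is not modelled here.

```python
FAT_EPOCH_YEAR = 1980  # year stored as an offset from 1980


def encode_fat_date(year, month, day):
    """Pack a date into the 16-bit FAT layout:
    bits 15-9 year offset, bits 8-5 month, bits 4-0 day."""
    offset = year - FAT_EPOCH_YEAR
    if not 0 <= offset <= 0x7F:
        raise ValueError(f"year {year} does not fit in a 7-bit field")
    return (offset << 9) | (month << 5) | day


def decode_fat_date(value):
    """Unpack a 16-bit FAT date into (year, month, day)."""
    return (FAT_EPOCH_YEAR + (value >> 9), (value >> 5) & 0x0F, value & 0x1F)


last = encode_fat_date(2107, 12, 31)
print(hex(last), decode_fat_date(last))  # 0xff9f (2107, 12, 31)

try:
    encode_fat_date(2108, 1, 1)
except ValueError as exc:
    print(exc)
```

Seven bits give 128 years, so the last representable date is 2107-12-31 and the first failure is in 2108.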

phkahler 3 years ago |

The question I have is, why Segger? When I saw this I was like "the debugger company?!?!" Clearly this wouldn't fall under their business, so it makes sense for them to open it up, but why did they do it in the first place?

weinzierl 3 years ago | |

They offer their own file system implementation (emFile) which supports either their own storage format (EFS) or FAT. The BigFAT article is posted in the emFile section of their website.

My suspicion is that customers are bugging them to support large files in emFile and they don't want to pay the license fee for exFAT. I think they can't even do that with their current licensing model, which is a one-time payment per product (not item) or product family.

EDIT: I tried to find out if Microsoft's exFAT is licensed per product or per unit and I found that it used to be a 300000 USD flat fee in 2009 but seems to be free since 2019. So my theory from above has no basis and I wonder why Segger does not simply implement exFAT?

noAnswer 3 years ago | | |

It is not true that it is "free since 2019".

Source:

https://www.microsoft.com/en-us/legal/intellectualproperty/t...

https://www.paragon-software.com/exfat-license/

https://en.wikipedia.org/wiki/ExFAT#Legal_status

You may be off the hook if you use Linux >= 5.7. And it seems that you are off the hook if you are a member of the Open Invention Network (OIN).

But SEGGER's embOS is not based on Linux and their customers are OEMs themselves. So their customers would need to be OIN members or pay royalties to MS.

matja 3 years ago | |

They have an entire RTOS ecosystem which supports a gazillion different microcontrollers.

ninefathom 3 years ago |

I'm a bit puzzled as to how split files with name standardization is an "extension." It seems to me that SEGGER is simply proposing a de facto file naming convention, and offering a few free tools (including a few abstraction drivers) to encourage adoption.

Can somebody fill me in, here- where's the value in what SEGGER is proposing, as opposed to what the entire IT community has already been doing for decades?

mort96 3 years ago | |

Well, if we view FAT32 + this name convention as a new filesystem, then filesystem drivers could let you transparently operate on files bigger than 4GB (GiB?) and take care of the splitting for you. FAT32 + this convention would essentially become a filesystem which supports files up to around 4TB. You wouldn't have to make the choice between the patent-encumbered exFAT and the open but limited FAT32.
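Roughly what "take care of the splitting for you" could look like inside a driver, as a sketch under loud assumptions: a part size of 4 GiB minus 128 KiB (the figure from my example further up), three-digit suffixes 000..999, and the `<base>.<NNN>.BigFAT` naming. None of these constants are taken from the spec, and `locate` is a made-up helper.

```python
# Assumed part size: 4 GiB minus 128 KiB (not taken from the spec).
CHUNK_SIZE = 4 * 2**30 - 128 * 2**10
# Assumed: continuation suffixes run from 000 to 999.
MAX_PARTS = 1000


def locate(base_name, offset):
    """Map a byte offset in the logical big file to
    (physical file name, offset inside that file)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    index, within = divmod(offset, CHUNK_SIZE)
    if index == 0:
        return base_name, within  # first chunk lives in the base file
    if index > MAX_PARTS:
        raise ValueError("offset beyond the last possible part")
    return f"{base_name}.{index - 1:03d}.BigFAT", within


# Upper bound on logical file size under these assumptions.
max_size = CHUNK_SIZE * (MAX_PARTS + 1)

print(locate("video.mp4", 0))
print(locate("video.mp4", CHUNK_SIZE))
print(f"max logical size: {max_size / 2**40:.2f} TiB")
```

With a base file plus 1000 parts of just under 4 GiB each, the ceiling lands a little under 4 TiB, which is where the "around 4TB" figure comes from.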

sampa 3 years ago |

If only they released it back when exFAT was released. Now it has no future.

bArray 3 years ago |

Is this only compatible with FAT32, or is it also compatible with FAT12/16? It would be very cool if this would support floppy disks.

Regarding the format, once you convert it, does the target device need to have a driver to support the format? It mentions that this would allow for > 4GB files for TVs, but these are typically non-updated very out of date OSes.

I think MS missed a trick by not making the boot sector also contain a simplistic driver, although it would have been a push to keep it all down at 512 bytes.

Dylan16807 3 years ago | |

> Is this only compatible with FAT32, or is it also compatible with FAT12/16? It would be very cool if this would support floppy disks.

It's simple enough to work on basically anything, but for what purpose?

The max file size on FAT12/16 is the same as the max drive size.

And FAT32 is very easy to implement for any system dealing with multiple megabytes.

bArray 3 years ago | | |

As I replied to the comment, it's about having a filesystem that scales from floppy disks to hard drives. This is quite important for hobby OSes.

pantalaimon 3 years ago | |

How do you get files > 4 GiB on a floppy disk?

bArray 3 years ago | | |

You don't of course, it's about having a filesystem that works from floppy disk all the way to hard drive.

JAA1337 3 years ago |

Awesome concept, especially for academia ... but is there a value proposition?

I love seeing this, don't get me wrong. I am just curious if there are any real world applications for this?

mort96 3 years ago | |

It would be great to have a non-patent-encumbered simple file format that's supported everywhere. The fact that this is based on FAT32 might help adoption, everyone's computers can already at least read a BigFAT drive, and BigFAT support could be added at the application level for systems which don't support it at an OS level.

prmoustache 3 years ago | | |

Are the filesystems used on BSD and Linux distros patent encumbered? Isn't UFS2 simple enough?

noAnswer 3 years ago | |

> but is there a value proposition?

Straight from their FAQ: We see emFile customers asking for solutions for bigger files. Implementing exFAT is not an option for us, as it is patent encumbered. SEGGER would need Microsoft's permission to implement and offer it, and our customers need to deal with Microsoft again to be able to use it in their products. This can be time-consuming and also expensive. We feel there should be a free alternative. The more popular BigFAT becomes, the better.

I guess using anything but FAT would make it hard for their developer base.

Bakary 3 years ago | |

I already use FAT32 for some USB sticks when I need to be able to use them on various OSes without having to give it any thought, or for long term archiving.

This would be extremely niche but would have its audience. Heck, I wouldn't be surprised if HN readers adopted it just to taste the thrill of unlimited and obscene power.

scohesc 3 years ago |

Would it not be possible to create a filesystem with modern capabilities but with backwards compatibility with FAT? Why can't we just have "legacy" commands built into the ReFS filesystem that process any FAT filesystem access?

I'm very ignorant of this but I'd love some insight from someone vastly more knowledgeable than me.

tenebrisalietum 3 years ago | |

A filesystem translates a filename-based streaming I/O API to the way a disk talks, which is "Read/write 512 times X bytes of data at disk block N."

The I/O API or "commands" are the same; different filesystems will implement it differently.
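A toy sketch of that translation, in case it helps. `RamDisk` and `read_bytes` are invented for illustration, the 512-byte block size is just the traditional sector size, and the file is assumed to sit contiguously from block 0. A real filesystem would first look up which blocks belong to the file (in FAT, by walking the allocation table).

```python
BLOCK_SIZE = 512  # traditional sector size, assumed here


class RamDisk:
    """Toy block device: it only knows how to return whole blocks."""

    def __init__(self, data):
        padding = -len(data) % BLOCK_SIZE  # pad up to a whole block
        self._data = data + bytes(padding)

    def read_block(self, n):
        start = n * BLOCK_SIZE
        return self._data[start:start + BLOCK_SIZE]


def read_bytes(disk, offset, length):
    """Byte-oriented read built on top of block-oriented reads."""
    if length <= 0:
        return b""
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    raw = b"".join(disk.read_block(n) for n in range(first, last + 1))
    skip = offset - first * BLOCK_SIZE  # bytes to drop from the first block
    return raw[skip:skip + length]


disk = RamDisk(bytes(range(256)) * 8)  # 2048 bytes, i.e. 4 blocks
chunk = read_bytes(disk, 500, 24)      # spans the boundary of blocks 0 and 1
print(len(chunk), chunk[:4].hex())
```

The caller asks for 24 bytes at offset 500; the layer underneath has to fetch two full blocks and trim them. Every filesystem does some version of this, they just differ in how they find the blocks.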

scohesc 3 years ago | | |

Thanks!

I wasn't aware of the intricacies of this stuff - I'll have to do some more reading in my free time.

steeleduncan 3 years ago |

Is there a linux kernel driver for this somewhere?

stuaxo 3 years ago |

If they want it to spread they should also write a fuse implementation and think about operating system support for Linux or BSD.

tumetab1 3 years ago |

The thing missing on the page is some kind of performance benchmark which I would love to read/see.

quickthrower2 3 years ago |

Is the big file handling seamless? If not, why not just split files and use regular FAT32?

And what about converting FAT32 to a linux partition? Or buy a new disk and move data over to that.

Edit: it is a genuine question. downvote implies not but honestly it is.

tekchip 3 years ago |

How is this not BackFAT?

tzahifadida 3 years ago |

Looks good. Keep at it!