File system differences: which one is better? Is ReFS the file system of the future, and which file system does Windows 10 use?

Microsoft's new ReFS file system first appeared on servers running Windows Server 2012. It was only later included in Windows 10, where it can be used only as part of the Storage Spaces feature on a pool of drives. In Windows Server 2016, Microsoft promises significant improvements to ReFS, and, if the rumors in the press are to be believed, it may replace the aging NTFS in the new edition of Windows 10 proudly called Windows 10 Pro for Workstations (for advanced PCs).

But what exactly is ReFS, how does it differ from the currently used NTFS file system, and what advantages does it have?

What is ReFS

In short, ReFS was designed as a fault-tolerant file system. It is built on NTFS code and is essentially a redesigned and improved NTFS. Its improvements include more reliable data storage, stable operation under stress, and limits on file, volume, and directory sizes, and on the number of files per volume and directory, that are bounded only by the range of 64-bit numbers. Recall that with 64-bit values the maximum file size is 16 exbibytes and the maximum volume size is 1 yobibyte.

Currently, ReFS is not a replacement for NTFS. It has its advantages and disadvantages, but you cannot, say, format a disk with it and install a fresh copy of Windows on it the way you can with NTFS.

ReFS protects your data

ReFS uses checksums for metadata and can also use checksums for data files. Every time you read or write files, ReFS checks the checksum to make sure it is correct. This means that the file system itself has a tool capable of detecting corrupted data on the fly.
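
As a rough sketch of the idea (this is not ReFS's actual on-disk format, and the helper names are made up), a checksum-verified read path can be modeled in a few lines of Python:

    import hashlib

    BLOCK = 64 * 1024                      # ReFS uses 64 KB data clusters
    checksums = {}                         # block number -> checksum, kept in metadata

    def write_block(dev, n, data):
        dev.seek(n * BLOCK)
        dev.write(data)
        checksums[n] = hashlib.sha256(data).digest()   # recorded at write time

    def read_block(dev, n):
        dev.seek(n * BLOCK)
        data = dev.read(BLOCK)
        if hashlib.sha256(data).digest() != checksums.get(n):
            raise IOError(f"block {n} failed its checksum")  # corruption caught on the fly
        return data

The essential point is that the checksum lives in the metadata, apart from the data it covers, so a stale or damaged block cannot vouch for itself.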

ReFS is integrated with Storage Spaces. If you have configured a mirrored storage space with ReFS, Windows can easily detect file system corruption and repair it automatically by copying data from the mirror copy to the damaged disk. This feature is available on both Windows 10 and Windows 8.1.


If ReFS detects corrupted data and no valid copy is available for recovery, the file system can immediately remove the corrupted data from the disk. Unlike NTFS, this does not require a reboot or taking the volume offline.

ReFS does more than just verify the integrity of files as they are read. It also scans data integrity automatically, regularly checking all files on the disk and identifying and fixing corrupted data. There is no need to periodically run chkdsk to check the disk.

The new file system resists data corruption in other ways as well. Suppose, for example, you update a file's metadata (say, its file name). NTFS modifies the file's metadata in place; if a system crash (power failure) occurs at that moment, the metadata is quite likely to be corrupted. ReFS instead creates a new copy of the metadata: it does not overwrite the old metadata but writes the new version to a new block and only then points to it. This eliminates the possibility of damaging the file. The strategy is called "copy-on-write" and is also found in other modern file systems, such as ZFS and Btrfs on Linux and Apple's new APFS.
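
The same idea can be sketched with an ordinary file standing in for a metadata block (a simplification: a real file system flips a block pointer rather than a file name, but os.replace gives the same atomic old-or-new guarantee):

    import json, os

    def update_metadata(path, changes):
        """Copy-on-write update: the old copy is never modified in place."""
        with open(path) as f:
            meta = json.load(f)
        meta.update(changes)
        tmp = path + ".new"
        with open(tmp, "w") as f:
            json.dump(meta, f)             # new version goes to a new location
            f.flush()
            os.fsync(f.fileno())           # and must reach the disk first
        os.replace(tmp, path)              # atomic switch: a crash leaves old or new, never a mix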

Limitations of the NTFS file system

ReFS is more modern than NTFS and supports much larger amounts of data and longer filenames. This is very important in the long run.

In NTFS, a file path is limited to 255 characters; in ReFS the maximum is an impressive 32,768 characters. Windows 10 currently has an option to lift this limit for NTFS; on ReFS volumes the long limit is the default.

ReFS does not support DOS 8.3 file names. On NTFS volumes you can reach "C:\Program Files" as "C:\PROGRA~1"; these short names exist for compatibility with old software. In ReFS you won't find the short names we are used to: they have been removed.

The theoretical maximum amount of data supported by NTFS is 16 exabytes; ReFS supports up to 262,144 exabytes. For now that figure seems enormous.

ReFS performance

The developers did not set out to make a file system that is faster across the board. Instead, they streamlined it for specific workloads.


For example, when used with a drive pool, ReFS supports real-time tier optimization. Suppose you have assembled a pool of two drives: the first chosen for speed and fast access to data, the second for reliability and long-term storage. In the background, ReFS automatically moves large chunks of data to the slower drive, ensuring that the data is stored reliably.

In Windows Server 2016, the developers added features that improve performance for specific virtual machine workloads. For example, ReFS supports block cloning, which speeds up copying virtual machines and merging checkpoints. To create a copy of a virtual machine, ReFS only writes a new copy of the metadata and points it at the data already on disk; with ReFS, multiple files can reference the same underlying data. When you later work with the virtual machine and change its data, the changes are written to a different location on the disk, and the original virtual machine data stays in place. This greatly speeds up copying and reduces disk load.
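
A toy model of block cloning (the structures below are invented for illustration, not the real ReFS ones) shows why cloning is cheap and why later writes do not disturb the original:

    class Extent:
        """A run of blocks on disk, shared by reference counting."""
        def __init__(self, data):
            self.data = data
            self.refs = 1

    class File:
        def __init__(self, extents):
            self.extents = extents

    def clone(src):
        for e in src.extents:
            e.refs += 1                    # only metadata is written; no data is copied
        return File(list(src.extents))

    def write(f, i, new_data):
        e = f.extents[i]
        if e.refs > 1:                     # shared extent: the write goes elsewhere,
            e.refs -= 1                    # the original data stays on disk
            f.extents[i] = Extent(new_data)
        else:
            e.data = new_data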

ReFS supports "Sparse VDL" (sparse files). A sparse file is a file in which runs of zero bytes are replaced by information about those runs (a list of "holes"). A hole is a sequence of zero bytes inside a file that is not actually written to disk; information about it is kept in the file system metadata.

Sparse file support technology allows you to quickly write zeros to a large file. This greatly speeds up the process of creating a new, empty, fixed-size virtual hard disk (VHD) file. The creation of such a file in ReFS takes a few seconds, while in NTFS it takes up to 10 minutes.
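
On Linux the effect is easy to reproduce from Python: declaring a file's length without writing its zeros takes milliseconds, and the hole occupies no blocks (on NTFS the file must additionally be flagged sparse, for example with fsutil sparse setflag):

    import os

    def create_sparse(path, size):
        with open(path, "wb") as f:
            f.truncate(size)               # records a "hole" in metadata; no zeros are written

    create_sparse("disk.img", 100 * 2**30) # a 100 GB image, created instantly
    st = os.stat("disk.img")
    print(st.st_size)                      # logical size: 107374182400
    print(st.st_blocks * 512)              # physical size on ext4: ~0 (st_blocks is Unix-only)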

Still, ReFS cannot completely replace NTFS.

Everything described above sounds good, but you won't be able to switch from NTFS to ReFS wholesale: Windows cannot boot from ReFS and requires NTFS for the system partition.


ReFS lacks many technologies available in NTFS: file system compression and EFS encryption, hard links, extended attributes, data deduplication, and disk quotas. ReFS does, however, support BitLocker full-volume encryption, just as NTFS does.

In Windows 10 you cannot format a disk partition with ReFS; the new file system is available only through Storage Spaces, where its main job is protecting data from corruption. In Windows Server 2016 you can format partitions with ReFS and use them to run virtual machines, but you cannot select ReFS for the boot disk: Windows boots only from NTFS.

It's unclear what the future holds for Microsoft for the new file system. Perhaps one day it will completely replace NTFS in all versions of Windows. But at the moment, ReFS can only be used for certain tasks.

Using ReFS

Much has been said above in support of the new file system, and its pros and cons have been described. Let's pause and summarize: for what purposes can, and perhaps should, ReFS be used?

On Windows 10, ReFS is applicable only together with the Storage Spaces feature. Be sure to format the storage pool with ReFS rather than NTFS; then you can take full advantage of its reliable data storage.

On Windows Server you can format a partition with ReFS using the standard Disk Management console. Formatting with ReFS is especially recommended if you run virtual servers. But remember that the boot disk must be formatted NTFS: booting from ReFS is not supported on Windows.


Why might a smartphone fail to launch programs from a memory card? How is ext4 fundamentally different from ext3? Why will a flash drive live longer if formatted as NTFS rather than FAT? What is the main problem with F2FS? The answers lie in the structure of the file systems, and that is what we will talk about.

Introduction

File systems determine how data is stored. They determine which restrictions the user will face, how fast read and write operations will be, and how long a drive will run without failures. This is especially true of budget SSDs and their younger siblings, flash drives. Knowing these features, you can squeeze the maximum out of any system and optimize it for specific tasks.

You have to choose the type and parameters of a file system every time you need to do something non-trivial. For example, suppose you want to speed up the most frequent file operations. At the file system level this can be done in several ways: indexing provides fast searches, pre-reserving free blocks makes it easier to rewrite frequently changing files, and staging data in RAM reduces the number of required I/O operations.

Features of modern file systems, such as lazy writing, deduplication, and other advanced algorithms, help extend a drive's service life. They are especially relevant for cheap SSDs with TLC memory chips, flash drives, and memory cards.

Separate optimizations exist for different tiers of disk arrays: for example, the file system can support lightweight volume mirroring, snapshots, or dynamic scaling without taking a volume offline.

Black box

Users mainly work with the file system offered by the operating system by default. They rarely create new disk partitions and even less often think about their settings - they just use the recommended parameters or even buy pre-formatted media.

For Windows fans everything is simple: NTFS on all disk partitions and FAT32 (or that same NTFS) on flash drives. If there is a NAS that uses some other file system, for most people it remains beyond perception: they simply connect to it over the network and download files as if from a black box.

On mobile gadgets with Android, ext4 is most often found in internal memory and FAT32 on microSD cards. Apple owners don't care at all which file system they have (HFS+, HFSX, APFS, WTFS...); for them there are only beautiful folder and file icons drawn by the best designers. Linux users have the richest choice, and support for file systems foreign to the OS can be added to both Windows and macOS; more on that later.

Common roots

Over a hundred different file systems have been created, but little more than a dozen can be called relevant. Although they were all designed for specific applications, many ended up conceptually related, because they use the same type of structure for representing (meta)data: B-trees.

As with any hierarchical structure, a B-tree starts at a root record and branches down to the final elements, the "leaves": individual records about files and their attributes. The main purpose of this logical structure is to speed up searches for file system objects on large dynamic arrays, such as hard drives of several terabytes or even more impressive RAID arrays.

B-trees require far fewer disk accesses than binary trees to perform the same operations. This is because all final objects in a B-tree sit at the same height, and the cost of every operation is proportional to the height of the tree.

Like other balanced trees, B-trees have the same path length from the root to any leaf. Instead of growing taller they branch wider: every branch point in a B-tree stores many references to child objects, so objects can be found in fewer calls. A large number of pointers reduces the number of the slowest disk operations: head positioning when reading arbitrary blocks.
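
A minimal sketch of a B-tree lookup makes the height argument concrete: each loop iteration is one node read (one disk access), and wide nodes keep the number of iterations tiny:

    from bisect import bisect_right

    class Node:
        def __init__(self, keys, children=None, values=None):
            self.keys = keys               # sorted keys (separator keys in internal nodes)
            self.children = children       # child pointers; None in a leaf
            self.values = values           # records (file metadata) in a leaf

    def search(node, key):
        while node.children is not None:   # descend: one disk read per level
            node = node.children[bisect_right(node.keys, key)]
        if key in node.keys:               # all leaves sit at the same depth
            return node.values[node.keys.index(key)]
        return None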

The concept of the B-tree was formulated back in the seventies and has undergone various improvements since. It is implemented in one form or another in NTFS, BFS, XFS, JFS, ReiserFS, and many DBMSs. They are all cousins in terms of the basic principles of data organization; the differences lie in details, which are often quite important. These related file systems also share a common disadvantage: they were all created to work with disks, before the advent of SSDs.

Flash memory as an engine of progress

Solid-state drives are gradually replacing disk drives, but so far they are forced to use file systems inherited from, and alien to, them. SSDs are built on arrays of flash memory, whose principles differ from those of disk devices. In particular, flash memory must be erased before it is written, and in NAND chips this operation cannot be performed on individual cells: it is possible only for whole large blocks.

This limitation stems from the fact that in NAND memory the cells are combined into blocks, each of which has only one common connection to the control bus. We will not go into the details of page organization or draw the full hierarchy. What matters is the very principle of group operations on cells, and the fact that flash memory blocks are usually larger than the blocks addressed by any file system. Therefore, all addresses and commands for NAND flash drives must be translated through an abstraction layer, the FTL (Flash Translation Layer).

Flash memory controllers provide compatibility with the logic of disk devices and support for commands from their native interfaces. Usually FTL is implemented in their firmware, but it can (partially) run on the host - for example, Plextor writes drivers for its SSDs that accelerate writing.

You cannot do without an FTL at all, since writing even one bit to a specific cell triggers a whole series of operations: the controller finds the block containing the cell; the block is read in full and written to cache or free space; the block is erased entirely; then it is rewritten with the necessary changes.
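
In pseudo-Python (the nand object and its operations are invented for illustration), the whole read-modify-write dance looks like this:

    PAGE = 4096
    ERASE_BLOCK = 128 * PAGE               # a NAND erase block spans many pages

    def ftl_write_page(nand, page_no, data):
        """Rewriting one 4 KB page costs reading, erasing and reprogramming a whole block."""
        start = (page_no * PAGE // ERASE_BLOCK) * ERASE_BLOCK
        block = bytearray(nand.read(start, ERASE_BLOCK))   # 1. read the whole block
        offset = page_no * PAGE - start
        block[offset:offset + PAGE] = data
        nand.erase(start)                                  # 2. erase it entirely
        nand.program(start, bytes(block))                  # 3. write it back, modified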

This approach resembles everyday army life: to give an order to one soldier, the sergeant assembles the whole formation, calls the poor fellow out of line, and orders the rest to disperse. In the now rare NOR memory, the organization was special forces style: each cell was controlled independently (each transistor had an individual contact).

The controllers keep acquiring more tasks, because with each generation of flash memory the manufacturing process shrinks to increase density and cut the cost of storage. Along with the technology node, the expected lifespan of the chips shrinks too.

Modules with single-level SLC cells had a declared endurance of 100 thousand rewrite cycles or more; many of them still work in old flash drives and CF cards. Enterprise-class MLC (eMLC) claimed between 10 and 20 thousand cycles, while ordinary consumer-grade MLC is rated at 3-5 thousand. Memory of this type is being actively squeezed out by the even cheaper TLC, whose endurance barely reaches a thousand cycles. Keeping the lifespan of flash memory at an acceptable level requires software tricks, and new file systems have become one of them.

Initially, manufacturers assumed the file system was unimportant: the controller itself was supposed to maintain the short-lived array of memory cells of whatever type, distributing the load optimally. To the file system driver it simulates an ordinary disk, while performing low-level optimizations on every access. In practice, however, optimization varies from magical to fictitious depending on the device.

In corporate SSDs, the built-in controller is a small computer. It has a huge memory buffer (half a gig and more), and it supports many methods to improve the efficiency of working with data, which avoids unnecessary rewriting cycles. The chip arranges all the blocks in the cache, performs lazy writes, performs deduplication on the fly, reserves some blocks and clears others in the background. All this magic happens completely unnoticed by the OS, programs and the user. With an SSD like this, it really doesn't matter what filesystem is used. Internal optimizations have a much larger impact on performance and resource than external ones.

Budget SSDs (and even more so - flash drives) are equipped with much less intelligent controllers. The cache in them is truncated or absent, and advanced server technologies are not used at all. In memory cards, the controllers are so primitive that it is often claimed that they do not exist at all. Therefore, for cheap devices with flash memory, external load balancing methods remain relevant - primarily using specialized file systems.

From JFFS to F2FS

One of the first attempts at a file system designed around the organization of flash memory was JFFS, the Journaling Flash File System. This development by the Swedish company Axis Communications was originally aimed at improving memory efficiency in the network devices Axis produced in the nineties. The first version of JFFS supported only NOR memory, but the second version gained NAND support.

JFFS2 is of limited use today. It is mostly found in Linux distributions for embedded systems: routers, IP cameras, NAS boxes, and other regulars of the Internet of Things; in general, wherever a small amount of reliable memory is required.

A further development of JFFS2 was LogFS, which stores inodes in a separate file. The idea came from Jorn Engel of IBM's German division and Robert Mertens, a professor at the University of Osnabrück. The LogFS source code is available on GitHub; judging by the fact that the last change was made four years ago, LogFS never gained popularity.

But these attempts spurred another specialized file system: F2FS. It was developed by Samsung, which makes a large share of the world's flash memory. Samsung builds NAND flash chips for its own devices and for other companies, and also develops SSDs with fundamentally new interfaces instead of legacy disk ones. From Samsung's point of view, a specialized file system optimized for flash memory was long overdue.

Four years ago, in 2012, Samsung created F2FS (Flash Friendly File System). The idea was good, but the implementation turned out to be raw. The key task of F2FS was simple: reduce the number of cell rewrite operations and spread the load across them as evenly as possible. That requires operating on several cells within the same block at once rather than hammering them one by one, which in turn means not rewriting existing blocks instantly at the OS's first request, but caching commands and data, appending new blocks to free space, and erasing cells lazily.

Today F2FS support is officially in Linux (and hence Android), but in practice it provides no special advantages. The file system's main feature (deferred overwriting) led to premature conclusions about its effectiveness. The old caching trick even fooled early versions of benchmarks, where F2FS showed an apparent advantage not of a few percent (as expected), nor even of several times, but of orders of magnitude: the F2FS driver was simply reporting completion of an operation the controller was only planning to perform. Still, even if F2FS's real performance gain is small, cell wear will definitely be lower than with ext4: the optimizations a cheap controller cannot do are performed at the level of the file system itself.

Extents and bitmaps

For now, F2FS is perceived as exotic for geeks; even Samsung's own smartphones still use ext4. Many consider ext4 merely a further development of ext3, but this is not entirely true: it is more a revolution than just breaking the 2 TB per-file barrier and increasing other limits.

When computers were large and files were small, addressing was easy. Each file was allocated a certain number of blocks, whose addresses were entered in a mapping table. That is how ext3, still in use today, works. But in ext4 a fundamentally different way of addressing appeared: extents.

An extent can be thought of as an extension of the inode: a set of blocks addressed as a whole, as one contiguous sequence. One extent can contain an entire medium-sized file, and a dozen or two extents are enough even for a large file. That is much more efficient than addressing hundreds of thousands of individual four-kilobyte blocks.
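
The difference is easy to see in numbers. Here is the same contiguous 1 GB file described both ways, with 4 KB blocks (a schematic comparison, not the actual on-disk encoding):

    # ext3-style block map: one entry per 4 KB block
    block_map = list(range(262144))        # 262,144 block numbers for 1 GB

    # ext4-style extents: (first block, length) runs addressed as a whole
    extents = [(0, 262144)]                # one entry for the same 1 GB

    # even a fragmented file needs only a handful of runs
    fragmented = [(0, 100000), (500000, 100000), (900000, 62144)]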

The writing mechanism itself also changed in ext4. Blocks are now allocated in a single request, and not in advance but immediately before data is written to disk. Delayed multi-block allocation removes unnecessary operations that ext3 was guilty of: ext3 allocated blocks for a new file immediately, even if the file fit entirely in cache and was scheduled for deletion as a temporary file.


FAT on a restricted diet

Besides balanced trees and their modifications, there are other popular logical structures. There are file systems with a fundamentally different type of organization - for example, linear. You probably use at least one of them a lot.

Mystery

Try this riddle: at twelve she began to gain weight, by sixteen she was a silly fatty, and by thirty-two she had grown fat yet remained a simpleton. Who is she?

That's right: this is a story about the FAT file system. Compatibility requirements left it a poor inheritance. On floppy disks it was 12-bit, on hard disks it was at first 16-bit, and it has survived to this day as 32-bit. In each successive version the number of addressable blocks grew, but in essence nothing changed.

The still popular FAT32 file system appeared twenty years ago. Today it is still primitive and does not support ACLs, disk quotas, background compression, or other modern data optimization technologies.

Why is FAT32 still needed today? Solely for compatibility. Manufacturers rightly assume that any OS can read a FAT32 partition, so they create it on external hard drives, USB flash drives, and memory cards.

How to free up flash memory on your smartphone

MicroSD(HC) cards used in smartphones are formatted as FAT32 by default. This is the main obstacle to installing applications on them and moving data from internal memory. To overcome it, you need to create an ext3 or ext4 partition on the card. All file attributes (including owner and access rights) can be carried over to it, so any application can work as if it had been launched from internal memory.

Windows cannot create more than one partition on a flash drive, but you can run Linux (at least in a virtual machine) or an advanced partitioning utility, for example MiniTool Partition Wizard Free. When it finds an additional primary ext3/ext4 partition on the card, the Link2SD application and its kin will offer far more options than with a single FAT32 partition.


Another argument in favor of FAT32 is its lack of journaling, which supposedly means faster writes and less wear on NAND flash cells. In practice, using FAT32 leads to the opposite and gives rise to many other problems.

Flash drives and memory cards with FAT32 die quickly precisely because any change rewrites the same sectors, where the two chains of file tables sit. Save a whole web page, and those sectors get rewritten a hundred times, once for every small GIF added to the drive. Launch portable software? It creates temporary files and changes them constantly while it runs. It is therefore much better to use NTFS on flash drives, with its fault-tolerant $MFT table: small files can be stored directly in the main file table, and its extensions and copies are written to different areas of the flash memory. NTFS indexing also makes searches faster.

INFO

No theoretical nesting-depth limits are specified for FAT32 and NTFS, but in practice they coincide: only 7,707 subdirectories can be created in a first-level directory. Lovers of matryoshka dolls will appreciate it.

Another problem most users run into is that a file larger than 4 GB cannot be written to a FAT32 partition. The reason is that in FAT32 the file size is described by 32 bits in the file allocation table, and 2^32 (minus one, to be precise) gives exactly those four gigs. So neither a movie in decent quality nor a DVD image can be written to a freshly bought flash drive.
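
The arithmetic is trivial to check:

    MAX_FAT32_FILE = 2**32 - 1             # the size field in the directory entry is 32 bits
    print(MAX_FAT32_FILE)                  # 4294967295 bytes, i.e. 4 GiB minus one byte

    dvd_image = 4_700_000_000              # a single-layer DVD image
    print(dvd_image <= MAX_FAT32_FILE)     # False: it will not fit on FAT32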

Failing to copy large files is only half the trouble: when you try, at least the error is immediately visible. In other situations FAT32 acts as a time bomb. Say you copy portable software to a flash drive and at first use it without problems. After a long while, one of the programs (accounting or email, for example) grows a bloated database and... simply stops updating it. The file cannot be rewritten because it has hit the 4 GB limit.

A less obvious problem is that in FAT32 the creation date of a file or directory can be specified only to within two seconds. That is not enough for many cryptographic applications that use timestamps. The low precision of the date attribute is another reason FAT32 is not considered a full-fledged file system from a security standpoint. Its weaknesses can, however, be turned to your advantage: for example, copying files from an NTFS partition to a FAT32 volume strips them of all metadata as well as inherited and explicitly set permissions. FAT simply does not support them.
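
The two-second granularity comes straight from the on-disk format: the FAT/DOS timestamp is a 16-bit field with 5 bits for hours, 6 for minutes, and only 5 for seconds, so seconds are stored divided by two:

    def fat_time_encode(h, m, s):
        return (h << 11) | (m << 5) | (s // 2)   # odd seconds cannot be represented

    def fat_time_decode(t):
        return (t >> 11) & 0x1F, (t >> 5) & 0x3F, (t & 0x1F) * 2

    t = fat_time_encode(14, 30, 37)
    print(fat_time_decode(t))              # (14, 30, 36): the odd second is lost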

exFAT

Unlike FAT12/16/32, exFAT was designed specifically for USB flash drives and large (≥32 GB) memory cards. Extended FAT eliminates the FAT32 shortcoming mentioned above: rewriting the same sectors on every change. As a 64-bit system it has practically no meaningful limit on single-file size: theoretically a file can be 2^64 bytes (16 EB) long, and cards of that size will not appear any time soon.

Another major difference in exFAT is its support for access control lists (ACLs). This is no longer the simpleton of the nineties, but exFAT's closed format has hindered its adoption. exFAT support is fully and legally implemented only in Windows (since XP SP2) and OS X (since 10.6.5); on Linux and *BSD it is supported either with restrictions or not entirely legally. Microsoft requires licensing for exFAT use, and there has been plenty of legal friction in this area.

Btrfs

Another prominent example of a B-tree file system is Btrfs. It appeared in 2007 and was created at Oracle with an eye to SSDs and RAID. It can be scaled dynamically: new inodes can be created on a live system, and a volume can be divided into subvolumes without allocating fixed space to them.

The copy-on-write mechanism implemented in Btrfs, together with full integration with the Device Mapper kernel module, makes almost instant snapshots possible via virtual block devices. Transparent compression (zlib or lzo) and deduplication speed up basic operations while extending the life of flash memory. This is especially noticeable with databases (compression of 2-4x) and small files, which are written in orderly large blocks and can be stored directly in the "leaves".

Btrfs also supports full journaling (of data and metadata), volume checking without unmounting, and many other modern features. The Btrfs code is published under the GPL. This file system has been considered stable on Linux since kernel 4.3.1.

Flight logs

Almost all more or less modern file systems (ext3/ext4, NTFS, HFSX, Btrfs, and others) belong to the general group of journaling file systems: they record changes in a separate log (journal) and consult it after a failure during disk operations. However, these file systems differ in how verbose and fault-tolerant their journaling is.

Ext3 supports three journaling modes: writeback, ordered, and full journaling. In writeback mode, only general changes (metadata) are logged, asynchronously with respect to the data changes themselves. In ordered mode the same metadata is logged, but only after the associated data has been written to disk. The third mode is full journaling of both metadata and file contents.

Only the last option guarantees data integrity. The other two merely speed up error detection during a check and guarantee restoration of the file system's own integrity, but not of file contents.
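
The ordering difference between the modes can be sketched like so (journal and disk are invented stand-ins; the point is only what gets logged, and in what order, before metadata is committed):

    def commit(journal, disk, metadata, data, mode):
        if mode == "journal":
            journal.log(data)              # full mode: file contents go through the log too
        if mode == "ordered":
            disk.write(data)               # data must be on disk before its metadata commits
        journal.log(metadata)              # all three modes log metadata
        journal.commit()                   # from here the change survives a crash
        if mode == "writeback":
            disk.write(data)               # data may lag the log: files can hold stale bytes
        if mode == "journal":
            disk.write(data)               # checkpointed to its final place later
        disk.write(metadata)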

NTFS journaling resembles ext3's ordered mode: only metadata changes are logged, and the data itself may be lost after a failure. This journaling method was not conceived as a way to achieve maximum reliability, but as a trade-off between performance and fault tolerance. That is why people used to fully journaled systems consider NTFS pseudo-journaled.

The NTFS approach is somewhat better than ext3's default. NTFS additionally creates periodic checkpoints to ensure all previously deferred disk operations have completed. These checkpoints have nothing to do with restore points in \System Volume Information\; they are just service records in the log.

Practice shows that such partial NTFS journaling is sufficient for trouble-free operation in most cases. After all, even in a sudden blackout, disk devices are not de-energized instantly: the power supply and the numerous capacitors in the drives provide just enough energy to complete the current write operation. Modern SSDs, fast and frugal, likewise usually have enough energy to finish pending operations. Switching to full journaling would cut the speed of most operations severalfold.

Connecting third-party file systems in Windows

The use of a file system is limited by OS-level support. For example, Windows does not understand ext2/3/4 or HFS+, but sometimes you need to use them. This can be done by adding an appropriate driver.

WARNING

Most drivers and plugins for third-party file systems have limitations and do not always work stably. They can conflict with other drivers, antivirus software, and virtualization programs.

An open driver for reading and writing ext2/3 partitions with partial ext4 support. The latest version supports extents and partitions up to 16 TB. LVM, ACLs, and extended attributes are not supported.


There is also a free plugin for Total Commander that supports reading ext2/3/4 partitions.


coLinux is an open-source, free port of the Linux kernel. Together with a 32-bit driver, it lets you run Linux on Windows 2000 through 7 without virtualization technologies. It supports 32-bit versions only; development of a 64-bit modification was cancelled. Among other things, coLinux can provide Windows access to ext2/3/4 partitions. Project support has been suspended since 2014.

Windows 10 may already contain native support for Linux file systems; it is just hidden. These thoughts are prompted by the kernel-level driver Lxcore.sys and the LxssManager service, which is loaded as a library by the Svchost.exe process. For more on this, see Alex Ionescu's talk "The Linux Kernel Hidden Inside Windows 10", presented at Black Hat 2016.


ExtFS for Windows is a paid driver from Paragon. It runs on Windows 7 through 10 and supports read/write access to ext2/3/4 volumes, providing nearly complete ext4 support on Windows.

HFS+ for Windows 10 is another proprietary driver from Paragon Software. Despite the name, it works on all versions of Windows from XP onward and provides full access to HFS+/HFSX file systems on disks with any partitioning scheme (MBR or GPT).

WinBtrfs is an early Btrfs driver for Windows. As of version 0.6 it supports both read and write access to Btrfs volumes, handles hard and symbolic links, and supports alternate data streams, ACLs, two types of compression, and asynchronous read/write mode. So far WinBtrfs comes without mkfs.btrfs, btrfs-balance, and the other utilities needed to maintain this file system.

File System Capabilities and Limitations: Pivot Table

File system | Max volume size | Max file size | File name length | Full path length (from root) | Max number of files/directories | Timestamp precision | Permissions | Hard links | Symbolic links | Snapshots | Compression | Encryption | Deduplication
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
FAT16 | 2 GB (512-byte sectors) or 4 GB (64 KB clusters) | 2 GB | 255 bytes (with LFN) | - | - | - | - | - | - | - | - | - | -
FAT32 | 8 TB (2 KB sectors) | 4 GB (2^32 - 1 bytes) | 255 bytes (with LFN) | up to 32 subdirectories with CDS | 65,460 | 10 ms (create) / 2 s (modify) | No | No | No | No | No | No | No
exFAT | ≈128 PB theoretical (2^32 - 1 clusters of 2^25 - 1 bytes) / 512 TB due to third-party limitations | 16 EB (2^64 - 1 bytes) | - | - | 2,796,202 per directory | 10 ms | ACL | No | No | No | No | No | No
NTFS | 256 TB (64 KB clusters) or 16 TB (4 KB clusters) | 16 TB (Windows 7) / 256 TB (Windows 8) | 255 UTF-16 characters | 32,760 UTF-16 characters, at most 255 per element | 2^32 - 1 | 100 ns | ACL | Yes | Yes | Yes | Yes | Yes | Yes
HFS+ | 8 EB (2^63 bytes) | 8 EB | 255 UTF-16 characters | not limited separately | 2^32 - 1 | 1 s | Unix, ACL | Yes | Yes | No | Yes | Yes | No
APFS | 8 EB (2^63 bytes) | 8 EB | 255 UTF-16 characters | not limited separately | 2^63 | 1 ns | Unix, ACL | Yes | Yes | Yes | Yes | Yes | Yes
Ext3 | 32 TB theoretical / 16 TB with 4 KB clusters (e2fs tools limit) | 2 TB theoretical / 16 GB for older programs | 255 UTF-16 characters | not limited separately | - | 1 s | Unix, ACL | Yes | Yes | No | No | No | No
Ext4 | 1 EB theoretical / 16 TB with 4 KB clusters (e2fs tools limit) | 16 TB | 255 UTF-16 characters | not limited separately | 4 billion | 1 ns | POSIX | Yes | Yes | No | No | Yes | No
F2FS | 16 TB | 3.94 TB | 255 bytes | not limited separately | - | 1 ns | POSIX, ACL | Yes | Yes | No | No | Yes | No
Btrfs | 16 EB (2^64 - 1 bytes) | 16 EB | 255 ASCII characters | 2^17 bytes | - | 1 ns | POSIX, ACL | Yes | Yes | Yes | Yes | Yes | Yes

I announced ReFS on my blog once before, when almost nothing was known about it; now it is time for a short but more coherent acquaintance with the newcomer.

20 years later

However, everything has its limits, including the capabilities of file systems. Today NTFS has reached its: checking large storage media takes too long, the journal slows access, and the maximum file size has nearly been reached. Realizing this, Microsoft implemented a new file system in Windows 8: ReFS (Resilient File System). ReFS is supposed to provide better data protection for large and fast hard drives. It surely has drawbacks, but until truly mass use begins with Windows 8, it is hard to speak of them.

So for now, let's try to understand the internals and advantages of ReFS.

ReFS was originally code-named "Protogon". It was first presented to the public about a year ago by Steven Sinofsky, president of the Windows Division at Microsoft, responsible for the development and marketing of Windows and Internet Explorer. He described it in these words:

"NTFS is the most widely used, advanced, and feature-rich file system today. But in rethinking Windows (we are currently developing Windows 8) we are not stopping there. Therefore, along with Windows 8, we are also introducing a completely new file system. ReFS is built on the foundations of NTFS, so it retains critical compatibility while being designed and engineered for the needs of the next generation of storage technologies and scenarios.

In Windows 8, ReFS will be introduced only as part of Windows Server 8, the same approach we used when implementing all previous file systems. Of course, at the application level, clients will get access to ReFS data just as they do NTFS data. Keep in mind that NTFS remains the industry-leading technology for PC file systems."

Indeed, we first saw ReFS in the server operating system Windows Server 8. The new file system was not developed from scratch: ReFS uses the same APIs as NTFS to open, close, read, and write files, and many familiar features migrated from NTFS, such as BitLocker disk encryption and symbolic links for libraries. Gone, however, are data compression and a number of other functions.

The main innovations in ReFS concern the creation and management of file and folder structures. Their task is to provide automatic error correction, maximum scalability, and always-online operation.

ReFS architecture

The on-disk implementation of ReFS structures differs fundamentally from other Microsoft file systems. Microsoft's developers realized their ideas by applying in ReFS the B-tree concept well known from databases. Folders in the file system are structured as tables whose entries are files. Files, in turn, receive their attributes as sub-tables, creating a hierarchical tree structure. Even free disk space is organized in tables.

Along with true 64-bit numbering of all system elements, this prevents bottlenecks from appearing as the system scales further.

As a result, the core of ReFS is the object table: a central directory that lists all the tables in the system. This approach has an important advantage: ReFS does away with complex journal management and commits new file information to free space, which prevents it from being overwritten.

The leaves of the "Catalog" are typed records. There are three basic record types for a folder object: the directory descriptor, the index record, and the nested object descriptor. All such records are packed into a separate B+ tree keyed by folder identifier; the root of that tree is itself a leaf of the "Catalog" B+ tree, which allows almost any number of records to be packed into a folder. At the bottom level, in the leaves of the folder's B+ tree, the first record is the directory descriptor, containing basic data about the folder (name, "standard information", the file name attribute, and so on).

Further into the catalog come the index records: short structures describing the items the folder contains. These records are significantly shorter than in NTFS, which means less metadata load on the volume.

Last come the catalog entries. For folders, these elements contain the folder's name, its identifier in the "Catalog", and the "standard information" structure. For files there is no identifier; instead, the structure holds all the basic data about the file, including the root of the file's B+ extent tree. Accordingly, a file can consist of almost any number of fragments.

Like NTFS, ReFS fundamentally distinguishes file information (metadata) from file contents (user data), but it offers the same protective functions to both. Metadata is protected by checksums by default; the same protection can be given (if desired) to user data. The checksums are stored on disk at a safe distance from the data they cover, making recovery easier in case of error.

The metadata of an empty file system occupies about 0.1% of the file system's size (about 2 GB on a 2 TB volume). Some core metadata is duplicated for better crash resilience.

The ReFS variant we saw in Windows Server 8 Beta supports only 64 KB data clusters and 16 KB metadata clusters. For now, the "Cluster size" parameter is ignored when creating a ReFS volume and is always set to the default; when formatting, 64 KB is likewise the only available cluster size.

Granted, this cluster size is more than enough for file systems of any size. The side effect, however, is noticeable redundancy in data storage: a 1-byte file occupies a full 64 KB block on disk.

ReFS security

In terms of file system architecture, ReFS has all the tools needed to safely recover files even after a major hardware failure. The main shortcoming of journaling in NTFS and similar file systems is that updating the disk can corrupt previously recorded metadata if power fails during the write; this effect even has an established name, the "torn write".

To prevent torn writes, Microsoft took a new approach: parts of the metadata structures contain their own identifiers, which makes it possible to verify that a structure belongs where it is found, and metadata links contain 64-bit checksums of the blocks they reference.
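
A sketch of such a self-validating link (the structures are invented for illustration; ReFS's real ones are on-disk B+ tree pages):

    import hashlib

    def make_link(addr, block_data):
        """A metadata link stores the address AND a checksum of the target block."""
        return addr, hashlib.sha256(block_data).digest()[:8]   # 64-bit checksum

    def follow_link(read_block, link):
        addr, expected = link
        data = read_block(addr)
        if hashlib.sha256(data).digest()[:8] != expected:
            raise IOError(f"torn or misplaced write detected at block {addr}")
        return data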

Any change to a metadata structure happens in two stages. First, a new (modified) copy of the metadata is created in free disk space; only then, if that succeeds, does an atomic update operation switch the link from the old (unchanged) to the new (changed) metadata area. This removes the need for journaling while automatically preserving metadata integrity.

The scheme described does not apply to user data, however, so changes to file contents are written directly to the file. A file is deleted by rebuilding the metadata structure, which leaves the previous version of the metadata block on disk. That makes it possible to recover deleted files until they are overwritten by new user data.

A separate topic is ReFS fault tolerance against disk damage. The system can detect all forms of corruption, including lost and misplaced writes, as well as so-called bit rot (degradation of the data on the medium).

When the "Integral Streams" option is enabled, ReFS also checks the contents of files against checksums and always writes changes to files in a third-party location. This gives you the assurance that preexisting data will not be lost when overwritten. The checksums are updated automatically when data is written, so if the write fails, the user will have a version of the file to check.


Another interesting topic in ReFS security is its interaction with Storage Spaces. ReFS and Storage Spaces were designed to complement each other as two components of a single storage system. Besides improving performance, Storage Spaces protects data from partial and complete disk failures by keeping copies on multiple disks. On a read failure, Storage Spaces can read a copy; on a write failure (even with complete loss of media data during reading/writing), it can "transparently" redistribute the data. As practice shows, such a failure often has nothing to do with the medium itself: it occurs because data is corrupted, lost, or stored in the wrong place.

These are exactly the kinds of failures ReFS can detect using its checksums. On detecting one, ReFS asks Storage Spaces for all available copies of the data and selects the correct one by checksum. It then instructs Storage Spaces to repair the damaged copies from the good one. All of this happens transparently to applications.
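
The cooperation can be modeled in a few lines (a sketch, with mirror objects standing in for Storage Spaces):

    import hashlib

    def read_with_repair(mirrors, expected):
        """Read every copy, keep the one matching the checksum, heal the rest."""
        copies = [m.read() for m in mirrors]
        good = next((c for c in copies
                     if hashlib.sha256(c).digest() == expected), None)
        if good is None:
            raise IOError("no intact copy left")   # here ReFS would fall back to salvage
        for m, c in zip(mirrors, copies):
            if c != good:
                m.write(good)                      # repair the damaged copy transparently
        return good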

As stated on Microsoft's Windows Server 8 site, checksums are always enabled for ReFS metadata, and on a volume mirrored with Storage Spaces automatic correction is enabled as well. All integrity streams are protected the same way. The result is an end-to-end, high-integrity solution in which relatively unreliable storage can be made highly reliable.

The mentioned integrity streams protect the file contents from all kinds of data corruption. However, this characteristic is inapplicable in some cases.

For example, some applications prefer to manage their file storage carefully, with a particular arrangement of files on disk. Because integrity streams relocate blocks every time file contents change, the layout becomes too unpredictable for such applications. Database systems are a prime example. As a rule, such applications keep their own checksums of file contents and can verify and correct the data themselves through direct API calls.


How ReFS behaves in case of disk corruption or storage failure should be clear by now. Harder to detect and overcome is the data loss associated with bit rot, where unnoticed damage in rarely read parts of the disk starts to grow. By the time such damage is read and detected, it may already have spread to the copies, or the data may have been lost to other failures.

To defeat bit rot, Microsoft added a background system task that periodically scrubs metadata and integrity-stream data on ReFS volumes residing in mirrored storage. Scrubbing reads all redundant copies and verifies them against the ReFS checksums; if the checksums disagree, bad copies are repaired from good ones.

There remains a threat that might be called the sysadmin's nightmare: cases, rare though they are, in which even a volume in mirrored space gets damaged. For example, memory in a failing system can corrupt data that then lands on disk and damages the redundant copies. And many users may choose not to use mirrored Storage Spaces with ReFS at all.

For such cases, when a volume becomes damaged, ReFS performs a "salvage": a function that removes the corrupt data from the namespace on a live volume. Its purpose is to prevent irreparable damage from affecting the availability of good data. If, say, a single file in a directory is corrupted and cannot be repaired automatically, ReFS removes that file from the file system namespace and restores the rest of the volume.

We are used to the fact that the file system cannot open or delete a damaged file, and the administrator cannot do anything about it.

But because ReFS can salvage corrupted data, the administrator can restore the file from a backup, or have the application recreate it, without taking the system down. The user or administrator no longer needs to run an offline check-and-repair procedure; for servers, this lets very large volumes be deployed without the risk of long downtime due to corruption.


ReFS in practice

Of course, the practicality and convenience (or lack thereof) of ReFS can be judged only after computers running Windows 8 become widespread and at least six months of active use have passed. For now, potential Windows 8 users have more questions than answers.

For example: will Windows 8 make it quick and easy to convert data from NTFS to ReFS and back? Microsoft says no built-in format conversion is planned, but the information can always be copied. The scope of ReFS is clear: at first it can serve only as a large data store for servers (in fact, it already does). External ReFS drives will not exist yet, only internal ones. Over time, ReFS will evidently acquire more features and become able to replace the legacy system.

Microsoft says that most likely this will happen with the release of the first service pack for Windows 8.

Microsoft also claims to have tested ReFS:

"Using a complex, extensive suite of tens of thousands of tests written for NTFS over more than two decades. These tests recreate the demanding deployment conditions we think the system may encounter, for example power failures and the problems often associated with scalability and performance. So we can say that ReFS is ready for test deployment in a controlled environment."

At the same time, however, the developers admit that as the first version of a large file system, ReFS will probably require caution in handling:

"We are not characterizing ReFS for Windows 8 as a beta release. The new file system will be ready for release when Windows 8 comes out of beta, because nothing matters more than data reliability. So, unlike any other aspect of the system, this requires a conservative approach to initial use and testing."

In many ways, this is why ReFS will be rolled out in phases: first as a storage system for Windows Server, then as storage for users, and finally as a boot volume. A similarly "cautious approach" has been taken with new file system releases in the past.

In this article we will figure out what features ReFS provides, how it is better than the NTFS file system, and how to recover data from a ReFS disk space. Microsoft's new ReFS file system was originally introduced in Windows Server 2012. It is also included in Windows 10 as part of the Storage Spaces feature and can be used for a pool of drives. With the release of Windows Server 2016 the file system has been improved, and it will soon be available in a new version of Windows 10.

What features does ReFS provide and how is it better than the current NTFS system?


What does ReFS mean?

ReFS, an abbreviation of Resilient File System, is a new file system built on NTFS foundations. At this stage ReFS is not a comprehensive replacement for NTFS for home users; it has its own advantages and disadvantages.

ReFS is designed to solve NTFS's basic problems. It is more resilient to data corruption, copes better with heavy workloads, and scales easily to very large file systems. Let's see what that means.

ReFS protects data from corruption

The file system uses checksums for metadata and can also use them for file data. Whenever a file is read or written, the system verifies the checksum, so corrupted data is detected in real time.

ReFS is integrated with the Storage Spaces feature. If you have configured a mirrored data store, Windows uses ReFS to detect file system corruption and repair it automatically by copying data from another drive. This feature is available in both Windows 10 and Windows 8.1.

If the file system detects corrupted data for which no alternate copy exists, ReFS immediately deletes that data from the disk. Unlike NTFS, this does not require rebooting the system or unplugging the storage device.

The need for the chkdsk utility disappears entirely, since the file system corrects itself the moment an error occurs. The new system is resistant to other kinds of data corruption too. NTFS writes file metadata in place: if a power outage or computer crash happens at that moment, you get data corruption.

When metadata changes, ReFS writes a new copy of it and associates it with the file only after the new metadata is safely on disk. This eliminates the possibility of data corruption. The feature is called copy-on-write and is also found in other modern file systems: ZFS and Btrfs on Linux, and Apple's APFS.

ReFS removes some NTFS limitations

ReFS is more modern and supports much larger volumes and longer file names than NTFS. In the long term these are important improvements. In NTFS a file name is limited to 255 characters; in ReFS a file name can be up to 32,768 characters long. Windows 10 lets you lift the name-length limit for NTFS, but on ReFS volumes the long limit is in force by default.

ReFS no longer supports short file names in the DOS 8.3 format. On an NTFS volume you can access C:\Program Files\ as C:\PROGRA~1\ to ensure compatibility with old software.

NTFS has a theoretical maximum size of 16 exabytes, while ReFS has a theoretical maximum of 262,144 exabytes. It hardly matters today, but computers keep evolving.

Which file system is faster, ReFS or NTFS?

ReFS was not designed to improve file system performance over NTFS. Microsoft has made ReFS much more efficient in very specific cases.

For example, when used with Storage Spaces, ReFS supports "real-time tier optimization". Say you have a storage pool with two drives: one for maximum performance, the other for capacity. ReFS always writes new data to the faster drive for maximum performance, and in the background the file system automatically moves large chunks of data to the slower drive for long-term storage.

In Windows Server 2016, Microsoft improved ReFS to provide better performance for virtual machine workloads. Microsoft's Hyper-V takes advantage of these benefits (in theory, any virtual machine platform could).

For example, ReFS supports block cloning, which speeds up cloning virtual machines and merging checkpoints. To create a copy of a virtual machine, ReFS only needs to write new metadata to disk and point it at the existing data, since in ReFS multiple files can reference the same underlying data on disk.

When the virtual machine writes new data to disk, it is written to a different location, and the original virtual machine data remains on disk. This greatly speeds up the cloning process and requires much less disk bandwidth.

ReFS also offers a new feature, "Sparse VDL", which lets ReFS write zeros to a large file quickly. This significantly speeds up creating a new, empty, fixed-size virtual hard disk (VHD) file: an operation that can take 10 minutes on NTFS takes seconds on ReFS.

Why ReFS Can't Replace NTFS

Despite a number of advantages, ReFS cannot yet replace NTFS. Windows cannot boot from a ReFS partition and requires NTFS. ReFS also lacks NTFS features such as data compression, EFS file encryption, hard links, extended attributes, data deduplication, and disk quotas. Like NTFS, though, ReFS supports full-drive encryption with BitLocker, including the volume's system structures.

Windows 10 does not allow formatting a partition with ReFS; the file system is available only within Storage Spaces, where it protects data on pools of multiple hard drives from corruption. In Windows Server 2016 you can format volumes with ReFS instead of NTFS and use them to store virtual machines, but the operating system still boots only from NTFS.


Hetman Partition Recovery can analyze disk space managed by the ReFS file system using a signature-analysis algorithm. Scanning the device sector by sector, the program finds specific byte sequences and displays them to the user. Recovering data from ReFS disk space is thus no different from working with NTFS:

  1. Download and install the program;
  2. Analyze the physical disk that is included in the disk space;
  3. Select and save the files you want to recover;
  4. Repeat steps 2 and 3 for all disks included in the disk space.

The future of the new file system is rather hazy. Microsoft may polish ReFS into a replacement for the obsolete NTFS in all versions of Windows, but at the moment ReFS cannot be used universally and serves only certain tasks.

If you have already installed and worked with Microsoft's newer operating systems, Windows Server 2012 and Windows 8, you have probably noticed that new volumes can now be formatted with the ReFS file system. What is ReFS? It stands for Resilient File System, that is, a fault-tolerant file system.

Microsoft sees ReFS as the successor to NTFS, currently the most popular file system, whose technological capabilities have reached their limits. With large storage media in particular, NTFS struggles: error checking takes too long, the journal slows access, and the limits on maximum file size are within reach.

Features of the ReFS file system

Most of ReFS's innovations concern how file and folder structures are created and managed. They exist to provide automatic error correction, high scalability, and always-online operation. Folders in the ReFS file system are structured as tables whose records are files; files in turn can have their own attributes, organized as sub-tables, forming the hierarchical B+ tree structure familiar from databases. Free disk space is organized in tables as well.

When developing ReFS, the following goals were pursued:

  • Maximum compatibility with useful existing NTFS features, discarding unnecessary ones that complicate the system.
  • Verification and automatic correction of data.
  • Scalability.
  • Architectural flexibility, built on mechanisms conceived specifically for ReFS.

Key features of ReFS

  • Higher limits on the sizes of partitions, directories, and files (see the list under "Limitations" below).
  • Metadata integrity protected by checksums.
  • A special write method, integrity streams, which provides additional data protection when part of the disk is damaged.
  • A new "allocate on write" (copy-on-write) transaction model.
  • Disk scrubbing: background scanning of the disk for latent errors.
  • The ability to organize storage pools, usable in virtualization, including for virtual machine fault tolerance and load balancing.
  • Data striping for improved performance.
  • Recovery of data around a damaged area of the disk.

Limitations of the ReFS file system

  • Maximum file size: 16 exbibytes.
  • Maximum volume size: 1 yobibyte.
  • Maximum file name length: 32,768 characters.

Supported NTFS Features

ReFS inherits many of the features and semantics of its predecessor, NTFS, including:

  • BitLocker encryption
  • the USN journal
  • access control lists (ACL)
  • symbolic links for libraries
  • mount points
  • junction points
  • reparse points

All data on the ReFS file system will be accessible through the same APIs that are currently used to access NTFS partitions.

ReFS drops the following NTFS features:

  • data compression
  • EFS file-level encryption
  • 8.3 short file names
  • hard links

ReFS in Windows 8

ReFS support was introduced in Windows 8 and Windows Server 2012, and only for data volumes: ReFS partitions cannot be used to install and boot the operating system. Over time, ReFS should gain more features and become able to replace the outdated NTFS completely; the new features are likely to appear in the first Service Pack for Windows 8.

In addition, ReFS cannot yet be used on removable and portable storage devices; for now it is used only on internal media.

A frustrating point: existing NTFS volumes cannot be converted to ReFS on the fly. The data has to be moved by ordinary copying.

A volume can be formatted with the ReFS file system through the Disk Management console. However, additional options, such as integrity checking, can be enabled only from the command line.

For example, you can format a volume with ReFS and enable integrity streams with the following command (E: here stands for the target volume):

    format E: /fs:refs /q /i:enable

Replacing /i:enable with /i:disable formats the volume with the integrity check disabled.