File processing systems require more memory and processing power than databases.

"They're the same"

Yes, storing data is just storing data. At the end of the day, you have files. You can store lots of stuff in lots of files and folders, and there are situations where this is the right way. A well-known version control system, Subversion [svn], eventually ended up using a filesystem-based model [FSFS] to store data, ditching BerkeleyDB. Rare, but it happens.

"They're quite different"

In a database, you have options you don't have with files. Imagine a text file [something like TSV/CSV] with 99999 rows. Now try to:

  • Insert a column. It's painful, you have to alter each row and read+write the whole file.
  • Find a row. You either scan the whole file or build an index yourself.
  • Delete a row. Find row, then read+write everything after it.
  • Reorder columns. Again, full read+write.
  • Sort rows. Full read, some kind of in-memory sort - and you do it all over again next time.
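The contrast above can be sketched with Python's standard-library csv and sqlite3 modules. This is a minimal illustration, not a benchmark; the file name, table, and data are made up:

```python
import csv
import os
import sqlite3
import tempfile

# --- File approach: deleting one row means rewriting the whole file ---
path = os.path.join(tempfile.mkdtemp(), "people.csv")  # hypothetical file
with open(path, "w", newline="") as f:
    csv.writer(f).writerows([["id", "name"], ["1", "ann"], ["2", "bob"]])

with open(path, newline="") as f:
    rows = [r for r in csv.reader(f)]          # full read
rows = [r for r in rows if r[0] != "2"]        # drop one row in memory
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)              # full rewrite

# --- Database approach: one statement, the engine handles storage ---
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO people VALUES (?, ?)", [(1, "ann"), (2, "bob")])
db.execute("DELETE FROM people WHERE id = 2")
print(db.execute("SELECT name FROM people").fetchall())  # [('ann',)]
```

With 3 rows the rewrite is invisible; with 99999 rows and frequent edits, the read-everything/write-everything pattern is exactly the pain the list describes.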

There are lots of other good points, but these are the first mountains you have to climb when you consider a file-based alternative to a database. The database folks have already programmed all of this for you, and it's yours to use: think of the likely [most frequent] scenarios, enumerate all the actions you want to perform on your data, and decide which approach works better for you. Think in benefits, not fashion.

Again, if you're storing JPG pictures and only ever look them up by one key [their id, maybe?], a well-thought-out filesystem layout is better. Filesystems, by the way, are close to databases these days, as many of them use a balanced-tree approach internally; on Btrfs you can just put all your pictures in one folder, and the OS will silently run something like an indexed lookup each time you access a file.

So, database or files?...
Let's see a few typical examples where one is better than the other. [These are not complete lists; surely you can add a lot more on both sides.]

DB tables are much better when:

  • You want to store many rows with the exact same structure [no block waste]
  • You need lightning-fast lookup / sorting by more than one value [indexed tables]
  • You need atomic transactions [data safety]
  • Your users will read/write the same data all the time [better locking]
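The first three bullets can be made concrete with a small sqlite3 sketch. The table and column names are invented for illustration; the same ideas apply to any SQL database:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
# Indexed lookup by a second value: no full scan, no hand-rolled index file
db.execute("CREATE INDEX idx_customer ON orders (customer)")
db.executemany("INSERT INTO orders (customer, total) VALUES (?, ?)",
               [("ann", 10.0), ("bob", 25.0), ("ann", 5.0)])
db.commit()

print(db.execute("SELECT total FROM orders WHERE customer = ?", ("ann",)).fetchall())

# Atomic transaction: both updates land, or neither does
try:
    with db:  # commits on success, rolls back on any exception
        db.execute("UPDATE orders SET total = total - 5 WHERE id = 1")
        raise ValueError("simulated failure mid-transaction")
except ValueError:
    pass

print(db.execute("SELECT total FROM orders WHERE id = 1").fetchone())  # still (10.0,)
```

Getting the same guarantees from a plain file means writing your own index format and your own crash-safe rewrite protocol, which is precisely the work the database has already done.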

Filesystem is way better if:

  • You like to use version control on your data [a nightmare with dbs]
  • You have big chunks of data that grow frequently [typically, logfiles]
  • You want other apps to access your data without API [like text editors]
  • You want to store lots of binary content [pictures or mp3s]
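The logfile bullet is the clearest case for files: appending touches only the end of the file, needs no schema, and the result is readable by any text editor or grep. A minimal sketch [the path and log lines are made up]:

```python
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "app.log")  # hypothetical path

def log(line: str) -> None:
    # "a" mode appends at the end of the file; nothing else is rewritten
    with open(log_path, "a") as f:
        f.write(line + "\n")

log("2024-01-01 12:00:00 INFO started")
log("2024-01-01 12:00:05 WARN low disk space")

with open(log_path) as f:
    print(f.read())
```

Storing the same stream as database rows buys you nothing unless you actually query it, and it locks the data behind the database's API.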

TL;DR

Programming rarely says "never" or "always". Those who say "database always wins" or "files always win" probably just don't know enough. Think of the possible actions [now + future], consider both ways, and choose the fastest / most efficient for the case. That's it.


In-memory database systems and technologies

Article · 11/18/2022

Applies to: SQL Server [all supported versions]

This page is intended to serve as a reference for in-memory features and technologies within SQL Server. An in-memory database system is one designed to take advantage of the larger memory capacities available on modern database servers. An in-memory database may be relational or non-relational in nature.

It is often assumed that the performance advantage of an in-memory database system comes mostly from accessing data resident in memory being faster [by several orders of magnitude] than accessing data sitting on even the fastest available disk subsystems. However, many SQL Server workloads can fit their entire working set in available memory, and many in-memory database systems can persist data to disk and may not always be able to fit the entire data set in available memory.

A fast volatile cache fronting considerably slower but durable media has been the predominant design for relational database workloads, and it necessitates particular approaches to workload management. The opportunities presented by faster memory transfer rates, greater capacity, and even persistent memory facilitate the development of new features and technologies that can spur new approaches to relational database workload management.

Hybrid buffer pool

Applies to: SQL Server [all supported versions]

Hybrid buffer pool expands the buffer pool for database files residing on byte-addressable persistent memory storage devices, on both Windows and Linux platforms, with SQL Server 2019 [15.x].

Memory-optimized tempdb metadata

Applies to: SQL Server [all supported versions]

SQL Server 2019 [15.x] introduces memory-optimized tempdb metadata, a feature that effectively removes some contention bottlenecks and unlocks a new level of scalability for tempdb-heavy workloads.

For more information on recent tempdb improvements, including memory-optimized metadata in SQL Server 2019 [15.x] and newer features, review System Page Latch Concurrency Enhancements [Ep. 6] | Data Exposed.

In-memory OLTP

Applies to: SQL Server [all supported versions]

In-memory OLTP is a database technology available in SQL Server and SQL Database for optimizing performance of transaction processing, data ingestion, data load, and transient data scenarios.

Configuring persistent memory support for Linux

Applies to: SQL Server [all supported versions] - Linux

This section describes how to configure persistent memory [PMEM] for SQL Server 2019 [15.x] on Linux using the ndctl utility.

Persisted log buffer

Service Pack 1 of SQL Server 2016 [13.x] introduced a performance optimization for write-intensive workloads that were bound by WRITELOG waits. Persistent memory is used to store the log buffer. This buffer, which is small [20 MB per user database], has to be flushed to disk in order for transactions written to the transaction log to be hardened. For write-intensive OLTP workloads, this flushing mechanism can become a bottleneck. With the log buffer on persistent memory, the number of operations required to harden the log is reduced, improving overall transaction times and increasing workload performance. This feature was originally introduced as Tail of Log Caching. However, that name conflicted with tail-log backups and the traditional understanding that the tail of the log is the portion of the transaction log that has been hardened but not yet backed up. Since the official feature name is Persisted Log Buffer, this is the name used here.

See Add persisted log buffer to a database.


Why database approach is better than file processing approach?

It allows certain users of the database, such as administrators, to have more control than other users, whereas in file processing all users have the same amount of control. It also reduces data redundancy: data is stored only once in a database, while in the traditional file processing approach the same data may be duplicated across files.

What are the two major weaknesses of file processing systems?

The file processing system has two major disadvantages: data redundancy and data inconsistency.

Which of the following is not an advantage of a database approach?

High acquisition cost is therefore not an advantage of a database management system; it is one of its drawbacks.

What is in memory database processing and what advantages does it provide quizlet?

An in-memory database stores all the data from the source system in RAM. This gives multicore CPUs faster access to the data for information processing and analysis.
