DotNet Programming World

Thursday, November 10, 2011

CSharp File Handling Comparison

This is not my own work; it was copied from somewhere on the internet.
I appreciate whoever wrote it.



Most C# applications need to handle file I/O, and the .NET Framework provides powerful methods for it. This article tests the methods in the System.IO namespace of the base class library, with many examples and some benchmarks that highlight performance differences between them.

Add using System.IO


First, the author's experience is that the .NET Framework provides excellent file handling/IO methods. They are optimized in the framework so that you don't need to hand-optimize buffer sizes or other mechanics. Make sure to include the IO namespace, as shown here.

//
// Include this namespace for all the examples.
//

using System.IO;

Various File methods in .NET


Here we see a table showing some of the most useful and popular File methods available. Many C# programmers use these methods quite extensively, particularly the ones dealing with lines and text.

File.ReadAllBytes
Useful for files not stored as plain text.
See example near the bottom.

File.ReadAllLines
Microsoft: "Opens a file, reads all lines of the file with the
specified encoding, and closes the file."
[See the benchmark below]

File.ReadAllText
Returns the contents of the text file at the specified path as
a string. [See the benchmark below]

File.WriteAllBytes
Not covered here.
It can be used in conjunction with File.ReadAllBytes.

File.WriteAllLines
Stores a string array in the specified file, overwriting the
contents. Shown in an example below.

File.WriteAllText
Writes the contents of a string to a text file.

File.AppendAllText
Use to append the contents string to the file at path.
Microsoft: "Appends the specified string to the file,
creating the file if it doesn't already exist."

File.AppendText
Not covered in this article.
You can also use standard StreamWriter code.
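
The table notes that File.WriteAllBytes can be used in conjunction with File.ReadAllBytes. Although WriteAllBytes is not covered in detail in this article, a minimal sketch of that pairing might look like the following. The ByteRoundTrip class, its WriteThenRead method, and the temporary file name are hypothetical names chosen for illustration; they are not part of the original article.

```csharp
using System;
using System.IO;

public static class ByteRoundTrip
{
    // Hypothetical helper: writes the bytes to the path, then reads them back.
    public static byte[] WriteThenRead(string path, byte[] data)
    {
        File.WriteAllBytes(path, data);   // Creates the file or overwrites it.
        return File.ReadAllBytes(path);   // Reads the whole file into memory.
    }
}

class Program
{
    static void Main()
    {
        // Assumed file name in the temp directory, used only for this sketch.
        string path = Path.Combine(Path.GetTempPath(), "roundtrip-example.bin");
        byte[] original = { 0x01, 0x02, 0xFF };

        byte[] copy = ByteRoundTrip.WriteThenRead(path, original);
        Console.WriteLine(copy.Length);

        File.Delete(path);
    }
}
```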

Reading lines with File.ReadAllLines


Here you want to read all the lines from a file and place them in an array. The following code reads each line of the file "file.txt" into an array. This code is convenient, but the benchmarks later in this article show that it is not the fastest option for large files.

~~~ Program that uses ReadAllLines (C#) ~~~

using System.IO;

class Program
{
    static void Main()
    {
        // Read in every line in specified file.
        // This will store all lines in an array in memory,
        // which you may not want or need.

        string[] lines = File.ReadAllLines("file.txt");
        foreach (string line in lines)
        {
            // Do something with line
            if (line.Length > 80)
            {
                // Example code
            }
        }
    }
}

Reading lines with StreamReader ReadLine


Here we see how to use the ReadLine method in a loop. This method belongs to the StreamReader class rather than the static File class, but StreamReader is also in the System.IO namespace. We will compare it to the File.ReadAllLines method.



--- Program that uses ReadLine (C#) ---

using System.IO;

class Program
{
    static void Main()
    {
        // Read in every line in the file.
        using (StreamReader reader = new StreamReader("file.txt"))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Do something with line
                string[] parts = line.Split(',');
            }
        }
    }
}

Benchmarking file handling methods


Here we compare several of the File and StreamReader methods. The goal of this benchmark is to help developers write code that is easy to understand and also fast and resource-friendly. The following two sets of results show how the above two code blocks perform.

~~~ File read benchmark for 52,930 lines ~~~
The test was repeated 200 times.
StreamReader was faster.

File.ReadAllLines: 28.226 ms
Using StreamReader ReadLine: 17.543 ms [faster]

~~~ File read benchmark for 20 lines ~~~
The test was repeated 20000 times.
StreamReader was faster.

File.ReadAllLines: 0.487 ms
Using StreamReader ReadLine: 0.480 ms [faster]

Results. StreamReader is much faster for large files with 10,000+ lines (it took about 38% less time in the 52,930-line test above), but the difference for smaller files is negligible. As always, plan for varying sizes of files, and use File.ReadAllLines only when performance isn't critical.
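
The article does not show the benchmark harness itself, so the following is only a sketch of how such a comparison could be set up with Stopwatch. The ReadBenchmark class, its method names, the warm-up run, the generated 50,000-line temporary file, and the choice to report an average per iteration are all assumptions made for illustration; timings will differ from the figures above depending on hardware and file caching.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class ReadBenchmark
{
    // Returns the average milliseconds per call of 'action' over 'iterations' runs.
    public static double AverageMs(Action action, int iterations)
    {
        action(); // Warm-up run, so JIT compilation is not included in the timing.
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds / iterations;
    }

    // Reads the file with File.ReadAllLines and returns the total character count.
    public static int ReadWithReadAllLines(string path)
    {
        int total = 0;
        foreach (string line in File.ReadAllLines(path))
        {
            total += line.Length;
        }
        return total;
    }

    // Reads the file with StreamReader.ReadLine and returns the total character count.
    public static int ReadWithReadLine(string path)
    {
        int total = 0;
        using (StreamReader reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                total += line.Length;
            }
        }
        return total;
    }
}

class Program
{
    static void Main()
    {
        // Build a temporary test file (assumed size; not the author's original file).
        string path = Path.GetTempFileName();
        string[] lines = new string[50000];
        for (int i = 0; i < lines.Length; i++)
        {
            lines[i] = "line number " + i;
        }
        File.WriteAllLines(path, lines);

        double allLinesMs = ReadBenchmark.AverageMs(() => ReadBenchmark.ReadWithReadAllLines(path), 200);
        double readLineMs = ReadBenchmark.AverageMs(() => ReadBenchmark.ReadWithReadLine(path), 200);

        Console.WriteLine("File.ReadAllLines: {0:F3} ms", allLinesMs);
        Console.WriteLine("StreamReader ReadLine: {0:F3} ms", readLineMs);

        File.Delete(path);
    }
}
```

Both methods return the same character total, which makes it easy to confirm that the two code paths did equivalent work before comparing their times.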

File.ReadAllText and alternative


Here we want to determine whether File.ReadAllText performs well. To answer this, the ReadAllText method was benchmarked against a StreamReader-based alternative; on a 4 KB file, ReadAllText was roughly 40% slower.

::: Program that uses ReadAllText and StreamReader (C#) :::

using System.IO;

class Program
{
    static void Main()
    {
        // A.
        // Read in file with File class.

        string text1 = File.ReadAllText("file.txt");

        // B.
        // Alternative: use custom StreamReader method.

        string text2 = FileTools.ReadFileString("file.txt");
    }
}

public static class FileTools
{
    public static string ReadFileString(string path)
    {
        // Use StreamReader to consume the entire text file.
        using (StreamReader reader = new StreamReader(path))
        {
            return reader.ReadToEnd();
        }
    }
}

StreamReader helper. In some projects, it would be worthwhile to use the above ReadFileString custom static method. In a project that opens hundreds of small files, the author estimates it would save about 0.1 milliseconds per file.

--- Benchmark of file text read methods (C#) ---

File.ReadAllText: 155 ms
FileTools.ReadFileString: 109 ms [faster]

Using List with File.ReadAllLines


Here we look at using the generic List type with the file handling methods. List and ArrayList are extremely useful data structures for C# programmers, as they allow object collections to expand or shrink dynamically. With LINQ, you can get a List of lines from a file in a single statement.

~~~ Program that uses ReadAllLines with List (C#) ~~~

using System.Collections.Generic;
using System.IO;
using System.Linq;

class Program
{
    static void Main()
    {
        // Read in all lines in the file, and then convert to List with LINQ.
        List<string> fileLines = File.ReadAllLines("file.txt").ToList();
    }
}

Counting lines with File.ReadAllLines


Here we need to count the number of lines in a file but don't want to write lots of code to do it. Note that the example here doesn't have ideal performance characteristics. We reference the Length property on the array returned.



--- Program that counts lines (C#) ---

using System.IO;

class Program
{
    static void Main()
    {
        // Another method of counting lines in a file.
        // This is NOT the most efficient way. It counts empty lines.

        int lineCount = File.ReadAllLines("file.txt").Length;
    }
}
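
Since the example above is described as not the most efficient way and as counting empty lines, here is one possible alternative sketch. It assumes .NET Framework 4.0 or later, where File.ReadLines enumerates lines lazily instead of building the whole array in memory. The LineCounter class and its method names are hypothetical, and this is not necessarily the approach the original author had in mind.

```csharp
using System;
using System.IO;
using System.Linq;

public static class LineCounter
{
    // Counts every line, including empty ones, without building an array.
    public static int CountAll(string path)
    {
        return File.ReadLines(path).Count();
    }

    // Counts only the lines that contain at least one character.
    public static int CountNonEmpty(string path)
    {
        return File.ReadLines(path).Count(line => line.Length > 0);
    }
}

class Program
{
    static void Main()
    {
        // Temporary file with three lines, one of them empty.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "cat", "", "dog" });

        Console.WriteLine(LineCounter.CountAll(path));      // 3
        Console.WriteLine(LineCounter.CountNonEmpty(path)); // 2

        File.Delete(path);
    }
}
```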

Checking lines in files


Here we look at a method that tests each line in a file using a declarative query expression from LINQ. Does a line matching a specific string exist in the file? Maybe you want to see if a name or location appears as a line in the file. Here we can harness the power of LINQ to find any matching line. See also the Contains method on the List type.

--- Program that uses LINQ on file (C#) ---

using System.IO;
using System.Linq;

class Program
{
    static void Main()
    {
        // One way to see if a certain string is a line
        // in the specified file. Uses LINQ to count elements
        // (matching lines), and then sets |exists| to true
        // if more than 0 matches were found.

        bool exists = (from line in File.ReadAllLines("file.txt")
                       where line == "Some line match"
                       select line).Count() > 0;
    }
}
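
The query above reads the whole file and counts every match before comparing against zero. As a hedged alternative, the sketch below uses the LINQ Any method together with File.ReadLines (assuming .NET Framework 4.0 or later), which can stop reading as soon as the first matching line is found. The LineSearch class and its ContainsLine method are hypothetical names introduced for this illustration.

```csharp
using System;
using System.IO;
using System.Linq;

public static class LineSearch
{
    // Returns true if any line in the file equals 'target' exactly.
    // Enumeration stops at the first match, so later lines are not read.
    public static bool ContainsLine(string path, string target)
    {
        return File.ReadLines(path).Any(line => line == target);
    }
}

class Program
{
    static void Main()
    {
        // Temporary file used only for this sketch.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "first", "Some line match", "last" });

        bool exists = LineSearch.ContainsLine(path, "Some line match");
        Console.WriteLine(exists);

        File.Delete(path);
    }
}
```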

Persisting data with File.WriteAllLines


Here we look at how you can write an array to a file. When you are done with your in-memory processing, you often need to write the data to disk. Fortunately, the File class offers an excellent WriteAllLines method. It receives the file path and then the array to write. This will replace all the file contents.

::: Program that writes array to file (C#) :::

using System.IO;

class Program
{
    static void Main()
    {
        // Write a string array to a file.
        string[] stringArray = new string[]
        {
            "cat",
            "dog",
            "arrow"
        };
        File.WriteAllLines("file.txt", stringArray);
    }
}

::: Output of the program :::

cat
dog
arrow

Appending text to files


Here we mention a simple way to append text to files. The previous example replaces the file's contents, but for a log file or error listing, we must append to the file instead. Note that we could read in the file, append to it in memory, and then write it out completely again, but that's slow. The File.AppendAllText method listed in the table above appends a string to the file directly, creating the file if it doesn't already exist.
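
As a rough illustration of appending with File.AppendAllText, the sketch below adds one line per call to a log file. The SimpleLog class, its AppendLine method, and the temporary file name are hypothetical and exist only for this example.

```csharp
using System;
using System.IO;

public static class SimpleLog
{
    // Appends one message as its own line. File.AppendAllText creates
    // the file if it does not exist and leaves existing content intact.
    public static void AppendLine(string path, string message)
    {
        File.AppendAllText(path, message + Environment.NewLine);
    }
}

class Program
{
    static void Main()
    {
        // Assumed log location in the temp directory, used only for this sketch.
        string path = Path.Combine(Path.GetTempPath(), "append-example.log");
        File.Delete(path); // Start from a clean state; no error if the file is missing.

        SimpleLog.AppendLine(path, "first entry");
        SimpleLog.AppendLine(path, "second entry");

        // Both entries are present because the second call did not overwrite the first.
        foreach (string line in File.ReadAllLines(path))
        {
            Console.WriteLine(line);
        }

        File.Delete(path);
    }
}
```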

Reading all bytes


Here we use File.ReadAllBytes to read a PNG image into memory. One use for this sample is to cache an image in memory for performance. This works very well and greatly outperforms reading the image from disk each time.

=== Program that caches binary file (C#) ===

using System.IO;

static class ImageCache
{
    static byte[] _logoBytes;
    public static byte[] Logo
    {
        get
        {
            // Returns logo image bytes, reading the file only on first access.
            if (_logoBytes == null)
            {
                _logoBytes = File.ReadAllBytes("Logo.png");
            }
            return _logoBytes;
        }
    }
}

Summary


In this tutorial, we saw several methods and patterns for using System.IO in the C# programming language. The author's experience is that C# and .NET are excellent at file handling, and one benchmark he has seen measured .NET as even faster than C++ on Windows in its default configuration. Nearly every medium-sized or larger program will need to use file input/output, and this article provides a sampling of some of the clearest methods for this purpose.
