Results 1 to 2 of 2

Thread: [RESOLVED] HOW TO: Import / Export data from excel to c# (the easy way for dummies like me)

Threaded View

  1. #1

    Thread Starter
    Join Date
    Jul 2018

    Resolved [RESOLVED] HOW TO: Import / Export data from excel to c# (the easy way for dummies like me)

    Hi everyone,

    I firstly want to thank for the help that has been given to me in other threads, especially thanks to jmcilhinney. Now it's my turn to lend a hand, from my modest knowledge I want to share it with anyone that needs help.

    For a proyect I'm developing I need the ability of import and export data from and to excel sheets (Nothing specially difficult, just cells with simple values). In a first moment I thought it was going to be very easy, of course, I was wrong. In a first moment I found several libraries or let's say "methods", so I didn't know how to face the problem, I spent a lot time exploring the "in theory" easiest methods, but they didn't work to me, I found codes with parts underlined in red on visual studio, I didn't understand and I was not capable of make it work...

    After some days of work and a lot of learning I've archieve what I need, so I share it and provide the solution in the way I would have like to find.

    1.- ¿How to manage excel data in .NET and C#?

    The easiest way to export/import data from excel is Microsoft Excel 15.0 Object Library (I've not use it but it seems to be easy to implement). Despite it's the easier way and the first solution you find when google it, from the first moment I did not want to use it.

    The reason to refuse it is that requires Excel to be installed on the computer you run the program. I want my program to be able to read/write excel sheets even when Office or Excel is not present in the computer.

    If you look a method different than Microsoft Excel 15.0 Object Library you'll find a few libraries and even little classes developed for this purpose. I've try and finally I'm going to use NPOI. If you want further info about what NPOI is: NPOI official site.

    To sum up, its a free to use library that not only gives the ability of import/export simple sheets with text on its cells, it also can give format to the sheets.

    2.- How to install NPOI so I can use it in my proyect

    You don't have to download it and add .dll to your proyect, I've directy install it via NuGet Package Manager.

    In the .NET window open your proyect then go to Tools Menu -> NuGet Package Manager -> Manage NuGet Packages for the solution:

    Once you're in this screen write "NPOI" in the search field:

    Now select it, select your proyect and click install.

    3.- Export data to a new workbook

    The export procedure has been really easy to me, it took me a few minutes to make it work. I've just follow the code provided in the official site:

    How to export data to a XLSX file

    With this sample you'll be able to export to a XLSX file, but once you follow and understand the code of the next section, import, you'll be able to easlily modify and use XLS.

    Returning to export, with the example of the web you'll understand the procedure, or at least, you'll be able to modify it so you can export your own dataset.

    Here I want to say that in C# the backlash is an special character, so for the backlashes of the paths you have to use \\ instead of \. For example for this file:

    C:\MyFolder\My Workbook.xlsx

    you have to use:

    C:\\MyFolder\\My Workbook.xlsx

    4.- Why XLS and XLSX are different?

    XLSX files are really .zip compressed files that holds the information in XML format inside them, you can test it renaming a .xlsx file to .zip, you'll be able to unzip it and see this.

    XLS files are not that way, they're simplier in some way.

    Of course, the way to read/write them is different, NPOI supports both formats.

    5.- Import data from an excel file

    While export was a bed of roses for me, import data has been a challenge. I found many examples based on the own objets of the person that wrote it, in most cases the code of those objects was not published, so it was difficult to understand, many times I had a lot of parts of the code underlined in red and didn't had the code that declare those objetcs/functions, etc.

    After a lot of time looking for examples and trying to understand the code, I've write my own code, but it's not a code for me (no one will care my code), it's a code made to understand how it works, made to be able to modify and addapt to your needs. The code has the only purpose of make anyone understand how it works and be a base for the develop, I don't control if the file exist, I don't store the imported cells in any variable...

    It's, to sum up, the example I'd like to have found, and that could have save me from some hours of frustation.

    To use this code simply create a new class and call it NPOI_Test. Open the class you've create and paste this code:

     using System;
        using System.Windows;    
        using System.IO;
        using NPOI.HSSF.UserModel; //We must have this using here to be able to use HSSF usermodel. It handles XLS files
        using NPOI.SS.UserModel;   //We must have this using here for using NPIO (ICell, IRow, ISheet and IWorkbook are here)
        using NPOI.XSSF.UserModel; //We must have this using here to be able to use HSSF usermodel. It handles XLSX files   
        //We use this class simply to create our test function inside it.
        class NPOI_Test
            //This function called ReadExcel simply opens an excel file and show the content of sheet 0 with MessageBoxes.
            //We pass the path to the Excel (for example "C:\\ExcelTest.xlsx")
            //The function returns an string var that contains the error message in case something goes wrong.
            public static string ReadExcel(string FileName)
                string ErrorMessage = ""; //we declare the var that will contain the error message in case of error.
                string CellContentString=""; //We'll use this to store the string of the cells when we'll iterate thorought them
                //what this try - catch does is simply running all the code inside the try {}. If it's ok, the catch code is not used, but if
                //something goes wrong, we store the error code in the var ErrorMessage (it's done on catch {})
                    //On next line we create an object of FileStream type that we'll use to stream content from the actual file
                    //What using does here is running inside code, it is:
                    //      var ObjectFileStream = new FileStream(FileName, FileMode.Open, FileAccess.ReadWrite)
                    //QUESTION: ¿Why doing it with this using and inserting all code of the function inside it?
                    //ANSWER: this way when using ends (when reach its } ), the objetc is erased automatically releasing the used memory.
                    //We are using ReadWrite mode, but I guess that if we're not goint to write, perhaps we could use "Read" (perhaps more safe of efficient?)
                    using (var ObjectFileStream = new FileStream(FileName, FileMode.Open, FileAccess.ReadWrite))            
                        //We get the file extension (XLS or XLSX) and store in a new string variable called FileExtensionString
                        string FileExtensionString = System.IO.Path.GetExtension(FileName);
                        //IWorkbook is a object type provided by NPOI library. This line we create an instance of this object under the name "ObjectWorkBook"
                        //By now we only declare the variable of this kind, but we'll let it empty and assing nothing to it.
                        IWorkbook ObjectIWorkBook;
                        //We create a reference in memory for a variable of type IFormulaEvaluator. This is an object type provided by NPOI that allow us to
                        //Get calculated values on cells that have a formula. Whithout it, in a cell that contains "=A1+A2", we would get the value
                        //"A2 + A2" instead of the result of the operation of sum A1 + A2
                        IFormulaEvaluator ObjectIFormulaEvaluator; //NOTE: If you're sure your sheets have no formulas you can delete or comment this line
                        //Depending on the kind of excel file (XLS or XLSX) we create the ObjectWorkBook we've just declare 2 lines before, in a way or in other. 
                        //HSSF is used for XLS and XSSF for XLSX
                        if (FileExtensionString.Equals(".xls"))
                            //create the instance of our object ObjectWorkBook for XLS excel files
                            ObjectIWorkBook = new HSSFWorkbook(ObjectFileStream);
                            //create the instance of our object ObjectIFormulaEvaluator for XLS excel files
                            ObjectIFormulaEvaluator = new HSSFFormulaEvaluator(ObjectIWorkBook); //NOTE: If you're sure your sheets have no formulas you can delete or comment this line
                            //NOTE: On next line we're really using XSSF for any file extension different from XLS (XLS but also PNG, DOCX, TXT...) Perhaps you should control
                            //it's really a XLSX and not any other
                            //create the instance of our object ObjectWorkBook for XLSX excel files
                            ObjectIWorkBook = new XSSFWorkbook(ObjectFileStream);
                            //create the instance of our object ObjectIFormulaEvaluator for XLSX excel files
                            ObjectIFormulaEvaluator = new XSSFFormulaEvaluator(ObjectIWorkBook); //NOTE: If you're sure your sheets have no formulas you can delete or comment this line
                        //We insert this MessageBox to show how to know how many Sheets does book have
                        MessageBox.Show("Number of sheets this workboot contains: " + ObjectIWorkBook.NumberOfSheets);
                        //Once we've create the IWorkBook object, next step is creating the Sheet. We load into it the first sheet on the actual workbook (passing 0 on GetSheetAt)
                        //We cant open second sheet (use 1 instead of 0), third (use 2 instead of 0), and so
                        //To clear: Isheet is another object type provided by NPOI, just like IWorkBook, we call our object ObjectISheet
                        ISheet ObjectISheet = ObjectIWorkBook.GetSheetAt(0);                                             
                        //Once created this ObjectISheet, we can look at its name thanks to the property ObjectISheet.SheetName:
                        MessageBox.Show("The name of this sheet is: " + ObjectISheet.SheetName);
                        //with the aprropiate code we could iterate all sheets, look for the one that have the name we want, and when finding it,
                        //we could show/load its cells (in this example we're going to show the cells of the first sheet of the book)
                        //Thanks to this "for" we go thorough all the Rows in the Sheet, starting with 0, that corresponds to the row number one in the real sheet
                        //To clear: sheet.LastRowNum returns the last row number in the sheet
                        for (int CountRow = 0; CountRow <= ObjectISheet.LastRowNum; CountRow++)
                                //Below File --> Workbook --> sheet, next is Row (we go throught all rows in a sheet).
                                //We create a new object to store the Row we're analyzing now, it's done with the IRow object type provided by NPOI
                                //As yoy can see, we load the Row number CountRow, that is controlled by the for loop we're inside now.
                                IRow ObjectIRow = ObjectISheet.GetRow(CountRow);
                                //We create this int var called LastColumnInThisRowInt to store the last column in this Row.
                                int LastColumnInThisRowInt = ObjectIRow.LastCellNum -1; //Important: The minus 1 is becouse LastCellNum really gives the number of columns (1 to n), not the last one (0 to n)
                                //Until now we've create File --> Workbook --> sheet --> Row.
                                //Now it's time to go throrought all columns in this row. To do it we make a new for loop. We go from Column zero (A in the sheet)
                                //to last Column in this row
                                for (int CountCol = 0; CountCol <= LastColumnInThisRowInt; CountCol++)
                                    //Complete schema is File --> Workbook --> sheet --> Row --> Cell 
                                    //ICell is the Object that NPOI provides to handle the cells.
                                    //Declare ICell Object and assing the cell number "CountCol" from the row we're handling now:
                                    ICell ICellObject = ObjectIRow.GetCell(CountCol);
                                    //Now we want to know the content of the cell. we're going to store it in the string variable we created at the
                                    //beggining, CellContentString.
                                    //We can't simply take the value ICellObject.Text since we want to take correct values depending on the type of cell
                                    //The function GetFormattedCellValue receives:
                                    //      ICellObject: The cell we're analyzing
                                    //      the objetc ObjectIFormulaEvaluator we created with the aim of read correctly the results of formulas
                                    CellContentString = GetFormattedCellValue(ICellObject, ObjectIFormulaEvaluator);
                                    //NOTE: If you're sure you'll never have formulas in the sheets, you can make it run faster by disable this feature
                                    //for doing it you have to:
                                    // 1.- Replace this last line by next one, so ObjectIFormulaEvaluator gets null value on the function GetFormattedCellValue:
                                    //      CellContentString = GetFormattedCellValue(ICellObject); //(Line 103)
                                    // 2.- Delete or comment the three lines related (above here):
                                    //      IFormulaEvaluator ObjectIFormulaEvaluator; //(Line 36)
                                    //      ObjectIFormulaEvaluator = new HSSFFormulaEvaluator(ObjectIWorkBook); //(Line 50)
                                    //      ObjectIFormulaEvaluator = new XSSFFormulaEvaluator(ObjectIWorkBook); //(Line 59)
                                    //Finally we show a Message box with the column number, row number and cell content
                                    //Not to say you can modify this to store the values on your own array or any other object where you want to store
                                    //the content of the sheet
                                    MessageBox.Show("Column: " + CountCol + "\nRow: "  + CountRow + " \nContent: " + CellContentString + "\n\nCell Format: " + ICellObject.CellType);                            
                            catch (Exception exinsidetheloop)
                                //In case we've an error processing cells:
                                ErrorMessage = "Error processing cells: " + exinsidetheloop.Message;
                catch (Exception ex)
                    //We copy the error cod on the variable so we can return it
                    ErrorMessage = ex.Message;
                //return the content of ErrorMessage so we can control when a error happends
                return ErrorMessage;            
        //this function is used by ReadExcel and return the value of the cell once verified to return correct value
        //It receives:
        //      ICellObject: The cell we're analyzing
        //      the objetc ObjectIFormulaEvaluator we created with the aim of read correctly the results of formulas
        private static string GetFormattedCellValue(ICell cell, IFormulaEvaluator ObjectIFormulaEvaluator = null)
            //if Cell is null we'll return an empty string
            if (cell != null)
                //we've to decide what to do depending of the type of cell
                switch (cell.CellType)
                    case CellType.String:
                        return cell.StringCellValue; //simply return the value in string of the cell
                    case CellType.Numeric:
                        //If the cell is numeric but holds a date, we convert the date to return correctly
                        if (DateUtil.IsCellDateFormatted(cell))
                            DateTime date = cell.DateCellValue;
                            ICellStyle style = cell.CellStyle;
                            // Excel uses lowercase m for month whereas .Net uses uppercase
                            string format = style.GetDataFormatString().Replace('m', 'M');
                            return date.ToString(format);
                        //If cell is numeric and not a date, simply return the numeric value
                            return cell.NumericCellValue.ToString();
                    case CellType.Boolean:
                        //for bool cells we return TRUE or FALSE
                        return cell.BooleanCellValue ? "TRUE" : "FALSE";
                    case CellType.Formula:
                        if (ObjectIFormulaEvaluator != null)
                            //When eval is not null, we call this same function again passing to get the actual value of the cell instead of
                            //the formula, we do it passing ObjectIFormulaEvaluator.EvaluateInCell(cell) instead of cell/ICellObjetc
                            //So, with the next line we make that in a cell with "=A1 + A2" we get result of the formula.
                            return GetFormattedCellValue(ObjectIFormulaEvaluator.EvaluateInCell(cell));
                            //if ObjectIFormulaEvaluator is null we return cell.CellFormula.
                            return cell.CellFormula;
                    case CellType.Error:
                        //we return the code of the error when cell is error.
                        return FormulaError.ForInt(cell.ErrorCellValue).String;
            //null or blank cell, or unknown cell type, we return blank string
            return string.Empty;
    Once created the class simply add this code to the function where you want to call it (for example in the on_click of a button):

    string ErrorCode = NPOI_Test.ReadExcel("T:\\book1.xlsx");
    Replace book1.xlsx with the path of the file to test.

    If you follow the code and read comments you'll understand what it does and you'll be able to modify to meet your needs.

    6.- Are you new in C# and .NET?

    I take advantage of this post to recommend you this great and free course:

    Home And Learn - Visual C# .NET

    It's quite basic but has heped to me to remember some objetc-oriented programming bases I had almost forgot. Again, thanks to jmcilhinney.

    Last edited by TassadarNET; Mar 29th, 2019 at 06:47 AM.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts


Click Here to Expand Forum to Full Width