Hey everyone, I am new to the forums and kind of hope to grow and get some insight from everyone here. I am having a little bit of trouble I would really love some some kind of help with something small I have been writing in Visual Studios 2010.
At my job, we deal with a lot of excel files, mainly CSV but others as well. We import these CSV files into our databases at work and we can only import 1000 lines at a time. I have already created something that can split a header line and 999 lines to equal 1000 lines. So now I am having a problem by simply implementing a simple Replace string and delete row. So what I have so far is:
Code:
Sub SplitCSV(ByVal FilePath As String, ByVal LineCount As Integer, ByVal MaxOutputFile As Integer, ByVal Status As UpdateProgressSub, ByRef IsAbort As Boolean)
' Validate first
If LineCount < 100 Then Throw New Exception("Number of lines must be more than 100.")
' Open the csv file for reading
Dim Reader As New IO.StreamReader(FilePath)
' Create the output directory
Dim OutputFolder As String = FilePath & "_Splits"
If Directory.Exists(FilePath) = False Then
Directory.CreateDirectory(OutputFolder)
End If
' Read the csv column's header
Dim strHeader As String = Reader.ReadLine
' Start splitting
Dim FileIndex As Integer
Do
' Update progress
FileIndex += 1
If Not Status Is Nothing Then
Status.Invoke((FileIndex - 1) * LineCount)
End If
' Check if the number of splitted files doesn't exceed the limit
If (MaxOutputFile < FileIndex) And (MaxOutputFile > 0) Then Exit Do
' Create new file to store a piece of the csv file
Dim PiecePath As String = OutputFolder & "\" & Path.GetFileNameWithoutExtension(FilePath) & "_" & FileIndex & Path.GetExtension(FilePath)
Dim Writer As New StreamWriter(PiecePath, False)
Writer.AutoFlush = False
Writer.WriteLine(strHeader)
' Read and writes precise number of rows
For i As Integer = 1 To LineCount
Dim s As String = Reader.ReadLine()
If s <> Nothing And _IsAbort = False Then
Writer.WriteLine(s)
Else
Writer.Flush()
Writer.Close()
Exit Do
End If
Next
' Flush and close the splitted file
Writer.Flush()
Writer.Close()
Loop
Reader.Close()
End Sub
(Some of my code is missing, mostly my visual form data)
What I need is to somehow replace invalid characters within my CSV file for example = , ' " .
What I also need is to somehow delete a entire row if a cell is empty in Column A,B,C
(Then I need that row to shift up, for example: Row 16 has a empty cell in Column B so the row needs to be deleted and Row 17+ that is correct shifts up)
I can already tell this is most likely a simple process and somehow its just blowing through me. I am still a beginner and in school in the learning process. This is also a separate program, not a Macro in Excel itself, basically trying to make this a one click process.
this is easy for me to do in vb6 /vba, but i do not know how much change is needed for .net
what do you want to replace the invalid characters with?
i do my best to test code works before i post it, but sometimes am unable to do so for some reason, and usually say so if this is the case. Note code snippets posted are just that and do not include error handling that is required in real world applications, but avoid On Error Resume Next
dim all variables as required as often i have done so elsewhere in my code but only posted the relevant part
come back and mark your original post as resolved if your problem is fixed
pete
Thanks for the reply I really need some help lol, umm basically I need it to search the entire Excel file and remove the invalid characters with nothing or a blank space. Do you think the way I did it is inefficient? My biggest ordeal though os having a check for Column A,B,C if a cell is empty to remove the entire row and shift all other up. I am beginning to think its not possible. Help is appreciated, wondering if I need to show my entire source.
Like I mentioned in my OP the split works great and fast, I just can't seem to figure out how to implement these two features I really need.
blanks = false ' remember this is vb6 /vba code modify to .net
lnarr = split(s, ",") ' split line into fields
for f = 0 to 2 ' test first 3 fields
if trim(lnarr(f)) = "" then blanks = true: exit for
next
if not blanks then
Writer.WriteLine(s)
else
i = i - 1 ' deduct this line from line count
end if
Else
Writer.Flush()
Writer.Close()
Exit Do
End If
i do my best to test code works before i post it, but sometimes am unable to do so for some reason, and usually say so if this is the case. Note code snippets posted are just that and do not include error handling that is required in real world applications, but avoid On Error Resume Next
dim all variables as required as often i have done so elsewhere in my code but only posted the relevant part
come back and mark your original post as resolved if your problem is fixed
pete
So what do I do to declare what "blanks" "lnarr""f" is?
i thought straight forward
blanks as boolean
lnarr() as string , not sure about arrays in .net
f as integer
i do my best to test code works before i post it, but sometimes am unable to do so for some reason, and usually say so if this is the case. Note code snippets posted are just that and do not include error handling that is required in real world applications, but avoid On Error Resume Next
dim all variables as required as often i have done so elsewhere in my code but only posted the relevant part
come back and mark your original post as resolved if your problem is fixed
pete
I apologize westconn1 I am still highly inexperienced and I only know the basics. I thought I got it to work but it was splitting my files empty into the output folder. I am not sure if I am allowed to post my source but I attached it if you don't mind just taking a look at it. Any time is appreciated.
Once again sorry, I am probably in over my head but making this tool will make my life easier.
i looked at your code and could not see any obvious problem, though note, i repeat, i am not familiar with .net coding, i do not see where you might have tried to incorporate the code i suggested, so i do not know what you have tried.
if you are lucky, robdog, koolsid or someone with .net experience may come on line and be able to help you
i do my best to test code works before i post it, but sometimes am unable to do so for some reason, and usually say so if this is the case. Note code snippets posted are just that and do not include error handling that is required in real world applications, but avoid On Error Resume Next
dim all variables as required as often i have done so elsewhere in my code but only posted the relevant part
come back and mark your original post as resolved if your problem is fixed
pete
Thanks for taking a look at it westconn, you have been a lot of help so far. I guess all I can do is more trial and error and reading. Do you think it can be done another way by chance?
there are nearly always multiple solutions, for any task
another method might be to automate excel
i could see no code to indicate what you had tried to read the first 3 columns
i did make the assumption that your .csv file actually had comma separated values (csv), which like all assumptions may be incorrect, if your fields are separated by some other identifier then you would need to change the code to suit
google on .net string functions to find the equivalents of what i suggested
i do my best to test code works before i post it, but sometimes am unable to do so for some reason, and usually say so if this is the case. Note code snippets posted are just that and do not include error handling that is required in real world applications, but avoid On Error Resume Next
dim all variables as required as often i have done so elsewhere in my code but only posted the relevant part
come back and mark your original post as resolved if your problem is fixed
pete
Yea, the file is using commas to separate, I dont need to replace them ugh I didn't realize that I had that to replace invalid characters. (In my OP)
My problem is everything I try, I cant get a replace or remove row function/code to work. I never get any errors either.
I never had anything in my original code to read the first 3 columns. The reason I don't want to use macros is because I don't have access to Excel all the time. So if I can automate this process removing invalid characters and rows that are missing data. What would be amazing is being able to generate a log file to tell me what rows were removed. I just don't have the apptitude or knowledge to do any of this. :\
Last edited by Huntk22; Nov 21st, 2011 at 04:35 PM.