Regex - Reading Large File Very Slow?
Aug 26, 2011
This code takes about 30 minutes and uses a lot of CPU. What is the problem?
Do
strLine = objReader.ReadLine()
If strLine Is Nothing Then
[code].....
This code takes about 60 minutes, with high CPU usage, for a text file of 90,000 lines. What is the problem? [code]
My problem is that I have very large text files (approx. 2 GB or more). They hold one record per line. The lines are not all the same length, and the data in them varies in length. I currently read the file line by line, then split each record on common delimiter characters. Processing a full file currently takes 3 hours, which is far too slow for its purpose.
I have a problem reading a text file using StreamReader. The file has between 500,000 and 1,000,000 lines. When I try to read it in a loop, I get an error, so I tried the StreamReader.ReadToEnd method instead. It worked fine: I get the entire contents of the file in one string. So far so good, but I have a small problem searching this huge string. I have to reformat the string into my desired format. To be more specific, the format of the input file is as follows:
50471100 8 2 6 5 0<LF><CR>
00000016 365442 12231<LF><CR>
00000026 112166 31133<LF><CR>
<end>
[Code]...
The following code is supposed to:
1. Read line-by-line a txt file with more than 500,000 lines, (each line 521 characters long)
2. extract an ID No from the line
3. query from a database for LCCIStatus
4. concatenate the value of LCCIStatus to the line
5. write the line to sample.txt
My problem is that this code works perfectly with the test file of 8,000 lines but fails with the actual files, which have over 500,000 lines. FYI, the test file contains data that I cut and pasted from the actual file.
[Code]...
I wrote a cleanup program to go through some directories and delete files whose creation date is older than, say, 6 months. It works fine with the directories I have that contain a few thousand small files. However, there is one directory (containing small backup files from another program) that is loaded with over 300,000 files, and the program locks up as soon as I read in the first file in that directory. I am convinced the directory has too many files in it to open. The server that the directory is on is slow: it takes half an hour to open the directory even on the server itself. I know it will take forever to delete the number of files I want to delete, but I don't understand why it gets stuck and hangs there with no error message. Here is where I get stuck. ListBox1 holds the directory I'm attempting to access:
For Each selectFile In My.Computer.FileSystem.GetFiles(ListBox1.Items.Item(Count), FileIO.SearchOption.SearchTopLevelOnly, "*.*")
compFile = Path.GetFileName(selectFile)
[code].....
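One possible direction for the cleanup question above, as a minimal sketch: `My.Computer.FileSystem.GetFiles` builds the whole file list before the loop starts, whereas `DirectoryInfo.EnumerateFiles` (available from .NET 4) yields entries as the directory is read. The module name, `DeleteOlderThan`, `folder`, and `cutoff` are hypothetical names chosen for illustration, and whether this removes the hang on a slow server is an assumption that would need testing.

```vbnet
Imports System
Imports System.IO

Module CleanupSketch
    ' Deletes files in one folder whose creation time is before the cutoff.
    ' Returns the number of files deleted. Requires .NET 4 for EnumerateFiles.
    Function DeleteOlderThan(ByVal folder As String, ByVal cutoff As DateTime) As Integer
        Dim deleted As Integer = 0
        Dim dir As New DirectoryInfo(folder)
        ' Entries are produced lazily, so the loop can start before the
        ' whole directory has been listed.
        For Each info As FileInfo In dir.EnumerateFiles("*", SearchOption.TopDirectoryOnly)
            If info.CreationTime < cutoff Then
                info.Delete()
                deleted += 1
            End If
        Next
        Return deleted
    End Function
End Module
```

A call might look like `DeleteOlderThan(folderPath, DateTime.Now.AddMonths(-6))`. If deleting during enumeration causes trouble on a given file system, collecting names in small batches before deleting is a safer variation.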
I need to read a large file as a binary stream. My code is:
Dim sr As New IO.FileStream(srcFile,
[Code]..
I'm trying to read a binary file from a FileMaker 11 container field using FileMaker's own ODBC driver. I was able to write files to the database, and this works fine. Retrieving them manually works fine, and the files look OK and are not corrupted. However, when retrieving them using VB.NET, if the file size is greater than approx. 5 MB, I get the following "uncatchable" error (yes, that's right, I can't "Try Catch End Try" it, it just crashes):
System.AccessViolationException was unhandled
Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
[code].....
I need to read a Visual Studio Solution file (.SLN file) to figure out which projects belong to the solution. By opening a SLN file in Notepad, one can see that the projects are stored like this:
Microsoft Visual Studio Solution File, Format Version 11.00
# Visual Studio 2010
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "RadioGadget", "RadioGadget\RadioGadget.csproj", "{AE664C6C-000B-44D9-BAD8-0200D6ABE90D}"
[code].....
excluding the ... part obviously (which is just some guid I don't care about). Then I need it to match the rest too, and more importantly, I only need it to return the filename that I've underlined in the example. When I simply use 'm.Value' as I am now I think I will get the entire matched string back, right? That gets me nowhere as I'd still need to parse the filename out of that manually... Bit pointless to use regex then...
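On the point about `m.Value` returning the whole match: capture groups let a regex return just one part. The sketch below assumes project lines have the shape shown in the example (type GUID, name, relative path, project GUID); `SolutionProjectsSketch`, `GetProjectPaths`, and the group names `name` and `path` are hypothetical. Note that VB string literals escape a double quote by doubling it, and that solution folders also appear as `Project(...)` lines, so the result may need filtering.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Text.RegularExpressions

Module SolutionProjectsSketch
    ' Matches: Project("{guid}") = "Name", "relative\path.csproj", ...
    ' and captures the name and the path in named groups.
    Private ReadOnly ProjectLine As New Regex("^Project\(""\{[^}]+\}""\)\s*=\s*""(?<name>[^""]+)"",\s*""(?<path>[^""]+)""")

    ' Returns the second quoted value (the project file path) of every Project line.
    Function GetProjectPaths(ByVal slnFile As String) As List(Of String)
        Dim paths As New List(Of String)
        For Each line As String In File.ReadAllLines(slnFile)
            Dim m As Match = ProjectLine.Match(line)
            ' Groups("path") holds only the captured file name, not the whole match.
            If m.Success Then paths.Add(m.Groups("path").Value)
        Next
        Return paths
    End Function
End Module
```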
I have a data file which I am decompressing and then reading line by line. Each line is passed to my function, which splits it into separate parts and inserts them into a database. My reader is currently taking ages (the largest file is 1.8 GB). I am using:
Code:
' File exists, read file.
Dim objReader As New System.IO.StreamReader(fileName)
[code]....
Is there a quicker way to do this? And possibly a progress bar to show how far it is through the file?
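For the progress-bar part of the question, one common approach is to compare the underlying stream position with the file length. This is a sketch under assumptions: `processLine` and `reportProgress` are hypothetical callbacks standing in for the asker's parsing function and UI update. Because `StreamReader` reads ahead in buffered chunks, the percentage is approximate, and the sketch does not by itself make the database inserts any faster.

```vbnet
Imports System
Imports System.IO

Module ProgressReadSketch
    ' Reads a file line by line and reports an approximate percentage
    ' each time it changes.
    Sub ReadWithProgress(ByVal fileName As String,
                         ByVal processLine As Action(Of String),
                         ByVal reportProgress As Action(Of Integer))
        Using reader As New StreamReader(fileName)
            Dim total As Long = reader.BaseStream.Length
            Dim lastPercent As Integer = -1
            Dim line As String = reader.ReadLine()
            While line IsNot Nothing
                processLine(line)
                ' Position is the number of bytes pulled from disk so far.
                Dim percent As Integer = CInt(reader.BaseStream.Position * 100 \ total)
                If percent <> lastPercent Then
                    lastPercent = percent
                    reportProgress(percent)
                End If
                line = reader.ReadLine()
            End While
        End Using
    End Sub
End Module
```

In a WinForms application this would typically run on a BackgroundWorker, with `reportProgress` forwarding to `ReportProgress` so the ProgressBar is updated on the UI thread.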
I need to write data to a file, preferably in binary format, but I am unfamiliar with the concept. Where's the easiest place to get the basics? I could come here with a specific need, but I'm at the point right now where I am more willing to work within the confines of keeping it simple.
Here's what I know:
1. how to open a new file
2. how to specify the record length
3. how to close the file
Some specific questions:
Does the record length have to be constant throughout the file?
Can I read the nth record without reading the whole file?
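On the second question: if every record has the same length, the nth record starts at byte offset n times the record length, so it can be read with a seek instead of a full scan. The sketch below assumes a fixed 64-byte record; `RecordLength`, `ReadRecord`, and the zero-based `index` are hypothetical. With variable-length records, a direct seek like this is not possible without a separate index of offsets.

```vbnet
Imports System
Imports System.IO

Module RecordFileSketch
    ' Hypothetical fixed record size in bytes.
    Const RecordLength As Integer = 64

    ' Returns the bytes of record number index (zero-based) without
    ' reading the records before it.
    Function ReadRecord(ByVal fileName As String, ByVal index As Long) As Byte()
        Using fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
            fs.Seek(index * RecordLength, SeekOrigin.Begin)
            Dim record(RecordLength - 1) As Byte
            Dim got As Integer = 0
            ' Read may return fewer bytes than requested, so loop until full.
            While got < RecordLength
                Dim n As Integer = fs.Read(record, got, RecordLength - got)
                If n = 0 Then
                    Throw New EndOfStreamException("Record " & index & " is incomplete or missing.")
                End If
                got += n
            End While
            Return record
        End Using
    End Function
End Module
```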
I want to be able to read the dimensions of a TIF image without loading the entire file.
PS Using Visual Basic 2008 on Vista64.
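One approach often suggested for this is the `Image.FromStream` overload that takes a `validateImageData` flag. Passing `False` skips validation of the pixel data, which in practice tends to avoid decoding the full image, though exactly how much of the file GDI+ reads is an implementation detail and should be treated as an assumption. `ImageSizeSketch` and `GetImageSize` are hypothetical names, and the project needs a reference to System.Drawing.

```vbnet
Imports System.Drawing
Imports System.IO

Module ImageSizeSketch
    ' Returns the pixel dimensions of an image file.
    Function GetImageSize(ByVal fileName As String) As Size
        Using fs As New FileStream(fileName, FileMode.Open, FileAccess.Read),
              img As Image = Image.FromStream(fs, False, False)
            ' Arguments: useEmbeddedColorManagement = False, validateImageData = False.
            Return img.Size
        End Using
    End Function
End Module
```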
I wrote last week about a problem with an MDI program that has a large number of forms (each with a large number of controls on it) and is "sluggish" when loading and when switching between the child forms. I've attached an example program in VS2008 (though the actual app is VB2010). Rather than show the hardware control application with all of its text fields and the picture boxes acting as indicators, I made a simple program to show the point. It is exaggerated, as it just loads 2,000 or so text boxes on a form. In the real app I have about 200 assorted controls per form: picture boxes, scroll bars, text boxes, labels, etc. There is also a large full-screen .jpeg as the background of each form. All controls are generated at run time and placed on the form in the New call of each form (as in the attached sample). The main issue seems to be the method I use to switch between child forms. I set the current form's .Visible = False and the next form's .Visible = True. I have used this because it "keeps the place" on each page if the user has scrolled or is looking at one section of the form. When the next form's .Visible = True happens, I see the controls added in a "machine gun" fashion instead of all at once.
I'm working on a larger VB application (Framework v3.5) where the compile time gets slower and slower as the application grows. It currently takes about 7 minutes to compile just the extensions project. We have other similar projects in C# that don't experience the slow compile time.
I just converted my project from VS 2005 to 2008 and started getting really bad lag, but only when IntelliSense iterates my dataset objects. The CPU maxes out one of my cores for about 2-3 seconds each time the IntelliSense list pops up. Is this a new feature of the 2008 IDE?
I am new to VB.Net, but I can tell you so far I love programming. That said, I'm building a tool basically to parse and display simple plain text log files. I'm hitting one stumbling block that really has me frustrated.
Other tools are able to load huge log files (500MB even) in a number of seconds. My tool, basically hangs loading a file that is maybe 5MB.
I'm using the MyString = StreamReader.ReadToEnd to read the contents into a string, and then RichTextBox1.Text = MyString to display the contents. That said, I really want to display the contents in a datagrid, but there has to be a better way of doing this?
How can I get my application to load larger files and display them faster? What am I doing wrong?
Is there a way to increase the rate at which a Process object in .NET raises the OutputDataReceived event? It creates a large buffer (I believe 1024 characters) that is dumped in bulk, which makes the output stream less fluid than I would have liked.
I want to perform two hashing operations concurrently on a single file, without reading the file twice. Is there some way to share the FileStream between two synchronized hashing threads? For example:
Dim Stream As New IO.FileStream("...", IO.FileMode.Open)
Dim HashA, HashB As Byte()
Dim A = New System.Threading.Thread(Sub()
[code]....
Maybe some way of caching a stream? Trouble is, I don't want the entire file in memory at once (it could be many gigabytes in size) and I don't want the file read more than once, due to speed issues. I want the file cached only sufficiently to ensure that both threads can work. For example, suppose at some point in time, thread A had read 100k of the file, but thread B had only read 20k of the file. The portion of the file between 20k and 100k should be cached, but then progressively forgotten as thread B catches up. Then again, if thread B is too slow, we might still end up with hundreds of megabytes being cached. Maybe thread A should be made to wait while thread B catches up. Maybe the ReadByte() function of the stream should block for a while if one of the threads is too far behind?
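A simpler alternative to sharing a stream between two threads is to read each chunk once and feed it to both hash objects in turn, using `HashAlgorithm.TransformBlock` and `TransformFinalBlock`. This is a single-threaded sketch, not the two-thread design described above. MD5 and SHA-1 are stand-ins because the question does not say which hashes are needed, and `DualHashSketch`, `HashBoth`, and the 80 KB chunk size are hypothetical choices.

```vbnet
Imports System.IO
Imports System.Security.Cryptography

Module DualHashSketch
    ' Reads the file once; every chunk is passed to both hash algorithms,
    ' so only one chunk is held in memory at a time.
    Sub HashBoth(ByVal fileName As String, ByRef resultA As Byte(), ByRef resultB As Byte())
        Using algA As HashAlgorithm = MD5.Create(),
              algB As HashAlgorithm = SHA1.Create(),
              fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
            Dim chunk(81919) As Byte   ' 80 KB read buffer
            Dim count As Integer = fs.Read(chunk, 0, chunk.Length)
            While count > 0
                ' Passing Nothing as the output buffer is allowed for hashing.
                algA.TransformBlock(chunk, 0, count, Nothing, 0)
                algB.TransformBlock(chunk, 0, count, Nothing, 0)
                count = fs.Read(chunk, 0, chunk.Length)
            End While
            ' Finish both hashes with an empty final block.
            algA.TransformFinalBlock(chunk, 0, 0)
            algB.TransformFinalBlock(chunk, 0, 0)
            resultA = algA.Hash
            resultB = algB.Hash
        End Using
    End Sub
End Module
```

If the hashing itself turns out to be the bottleneck rather than the disk, the two `TransformBlock` calls for a chunk could be run in parallel and joined before the next read, which keeps the cached window to a single chunk.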
I have a checked list box that is populated with the text from a text file. I started off with this code:
Dim FileToLoad As String
FileToLoad = TextBox3.Text
Dim fs As FileStream = New FileStream(FileToLoad, FileMode.Open)
[code].....
A friend of mine has no HTML knowledge, so I'm attempting to write a program that replaces certain parts of an HTML file to suit his needs. I've edited the HTML file and marked certain parts with "tags" like this:
#IMAGEURL1# I have a textbox where he can copy and paste an image URL, and hopefully #IMAGEURL1# is then replaced with the contents of the textbox.
So can someone please enlighten me as to how to open an HTML file (there's no textbox to display the contents just yet, I'll add one if needed), find the specified text, and then replace it with what's in a textbox?
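For a small template file, the find-and-replace described above can be sketched with `File.ReadAllText`, `String.Replace`, and `File.WriteAllText`. `TemplateSketch`, `FillTemplate`, `templatePath`, `outputPath`, and `imageUrl` are hypothetical names. Writing to a separate output file is an assumption made here so that the marked-up template stays reusable.

```vbnet
Imports System.IO

Module TemplateSketch
    ' Loads the template, swaps the placeholder for the URL, and saves a copy.
    Sub FillTemplate(ByVal templatePath As String, ByVal outputPath As String, ByVal imageUrl As String)
        Dim html As String = File.ReadAllText(templatePath)
        ' Replace substitutes every occurrence of the placeholder.
        html = html.Replace("#IMAGEURL1#", imageUrl)
        File.WriteAllText(outputPath, html)
    End Sub
End Module
```

From a button handler this might be called as `FillTemplate("page.html", "page_out.html", TextBox1.Text)`, where the file names and `TextBox1` are placeholders.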
I am trying to import a TAB (not comma) delimited text file into a DataGridView. The following code works fine if I have a comma-separated file; all I have to do is change the FMT to "Delimited". It just does not work with FMT=TabDelimited: all columns are read into a single DataTable column. The text file is ANSI text, and I have double-checked that the tabs are tabs and not spaces. I even exported a sample tab-delimited file from Excel. Can this even be done using the Text Driver? [code]
How do I play a WAV file while the computer is reading aloud a text file? It uses Text to Speech synthesis and I need a laughter wav to play when the computer comes across something funny in the line.
I'm new to Visual Basic. I'm trying to get this code to read a .txt file line by line. If the only thing that the line says is "B" it should add one to the intTotalBoys integer and so on with G for Girls, F for Fathers, and M for Mothers. I'm not sure why it won't work. [code]
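Since the original code is not shown, the following is only a sketch of how such a counter could be written, not a diagnosis of why the asker's version fails. It trims each line before comparing, because stray spaces are a common reason a line fails to equal "B". Apart from the idea of a boys total from the question, all names (`CountSketch`, `CountPeople`, the ByRef parameters) are hypothetical.

```vbnet
Imports System.IO

Module CountSketch
    ' Counts lines that consist only of B, G, F, or M (after trimming spaces).
    Sub CountPeople(ByVal fileName As String,
                    ByRef boys As Integer, ByRef girls As Integer,
                    ByRef fathers As Integer, ByRef mothers As Integer)
        Using reader As New StreamReader(fileName)
            Dim line As String = reader.ReadLine()
            While line IsNot Nothing
                Select Case line.Trim()
                    Case "B" : boys += 1
                    Case "G" : girls += 1
                    Case "F" : fathers += 1
                    Case "M" : mothers += 1
                End Select
                line = reader.ReadLine()
            End While
        End Using
    End Sub
End Module
```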
I would like to read from a file and ignore lines that start with --. I know how to read line by line, but I just need to ignore those lines.
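A minimal sketch of the filter described above, using `String.StartsWith` inside the usual read loop. `SkipCommentsSketch` and `ReadNonCommentLines` are hypothetical names. The check is applied to the raw line, so a line with leading spaces before `--` is kept; calling `TrimStart()` first would change that.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module SkipCommentsSketch
    ' Returns every line of the file except those beginning with "--".
    Function ReadNonCommentLines(ByVal fileName As String) As List(Of String)
        Dim result As New List(Of String)
        Using reader As New StreamReader(fileName)
            Dim line As String = reader.ReadLine()
            While line IsNot Nothing
                If Not line.StartsWith("--", StringComparison.Ordinal) Then
                    result.Add(line)
                End If
                line = reader.ReadLine()
            End While
        End Using
        Return result
    End Function
End Module
```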
I need to write an object out to a text file, as well as read objects in from text files. How do I accomplish this? This is the code I've used to read and write simple lines of text. Is there a small modification to this, or a different function I can use, to read in an entire object?
Code:
Dim path As String
path = "Security.txt"
[code].....
I have just recently been using VB 2010 after using VB5. I have noticed a lot of changes. The problem I have is that I wish to open and save text files to and from arrays in the background. I've attached what I would do in VB5. I have searched around, but all the examples I find use a Textbox instead of an array. Can anyone show me how I can do this with VB 2010?
I have a log file (Log.txt), a timer, and a RichTextBox1.Text. The timer opens the log file again and again every millisecond. Here is the code:
Code:
That keeps track of my log file inside my RichTextBox, but there is a big problem. The RichTextBox slows down badly if the log file gets too big. Sometimes it kicks me out of the program, and the taskbar says Program Not Responding.
I wanted to know if there is a method to read a bigger log file without having such issues. (It mostly happens when the log file goes over 13,000 characters.)
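One way to avoid reloading the whole log on every tick is to remember how far the file has been read and fetch only what was appended since. This is a sketch under assumptions: `LogTailSketch`, `ReadNewText`, and `lastPosition` are hypothetical names, `Encoding.Default` is a guess at the log's encoding, and a multi-byte character that is only half written at the moment of reading is not handled.

```vbnet
Imports System.IO
Imports System.Text

Module LogTailSketch
    ' Returns only the text appended since lastPosition and advances
    ' lastPosition to the new end of the file.
    Function ReadNewText(ByVal fileName As String, ByRef lastPosition As Long) As String
        ' FileShare.ReadWrite lets the other program keep writing to the log.
        Using fs As New FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
            ' If the file shrank, it was truncated or replaced: start over.
            If fs.Length < lastPosition Then lastPosition = 0
            fs.Seek(lastPosition, SeekOrigin.Begin)
            Using reader As New StreamReader(fs, Encoding.Default)
                Dim newText As String = reader.ReadToEnd()
                lastPosition = fs.Position
                Return newText
            End Using
        End Using
    End Function
End Module
```

In the timer's Tick handler the result could be passed to `RichTextBox1.AppendText(...)` instead of reassigning `.Text`, and a timer interval of a few hundred milliseconds is usually plenty for a log viewer.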
I have pretty large file names that follow a standard naming practice and I am attempting to write a RegEx to match them. When you are reading the file name it follows the convention:
p_d(set of 8 numeric)_t(set of 8 numeric)_images.ext_(any # of alphanumeric)
For example:
p_d12345678_t12345678_images.ext_0
The few I attempted brought back unrecognized escape sequence OR no matches found. For example this one brings back no matches:
fileInfo = value
'that changes depending on what file I'm looking for
RegEx("p_d\d*_t" & fileInfo & "\d*_images\.ext_\w*")
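A pattern for the naming convention as described (literal `p_d`, 8 digits, `_t`, 8 digits, `_images.ext_`, then any number of word characters) could be sketched as follows. `FileNamePatternSketch`, `NamePattern`, and `IsImageFileName` are hypothetical names, and the sketch leaves out the `fileInfo` variable part because its contents are not shown. VB string literals do not process backslashes, so `\d` and `\.` reach the regex engine unchanged. If a variable piece is spliced into the pattern, wrapping it in `Regex.Escape` keeps any special characters in it from being read as regex syntax.

```vbnet
Imports System.Text.RegularExpressions

Module FileNamePatternSketch
    ' ^ and $ anchor the pattern so the whole name has to match.
    Private ReadOnly NamePattern As New Regex("^p_d\d{8}_t\d{8}_images\.ext_\w*$")

    ' True when the name follows the p_d########_t########_images.ext_* convention.
    Function IsImageFileName(ByVal name As String) As Boolean
        Return NamePattern.IsMatch(name)
    End Function
End Module
```

With this pattern, `IsImageFileName("p_d12345678_t12345678_images.ext_0")` returns True.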
I noticed that my application is running a bit slow. This is a Windows Service built using VB2010 with SQL Server as a back end. Its main function is to poll a folder looking for a text file. If found it reads the text file and imports the data into SQL Server.
While the slowness could be caused by a number of reasons, I am looking at my code to determine if it could be more efficient. Below I've pasted snippets of code where I initialize the DB connection and execute SQL statements. I've also included a snippet that illustrates how I am processing the text file.
I'd appreciate if the group could have a look at these and let me know if the method that I am using is the most efficient possible. At this time I am also focusing on other probable issues, but I can't rule out the possibility that the code that is in place might be a contributing factor.
[Code]...
We have an automatic process that opens a template excel file, writes rows of data, and returns the file to the user. This process is usually fast, however I was recently asked to add a summary page with some Excel formulas to one of the templates, and now the process takes forever.
It successfully runs with about 5 records after a few minutes, however this week's record set is almost 400 rows and the longest I've let it run is about half an hour before cancelling it. Without the formulas, it only takes a few seconds to run.
Are there any known issues with writing rows to an Excel file that contains formulas? Or is there a way to tell Excel not to evaluate formulas until the file is opened by a user?
The formulas on the summary Sheet are these:
' Returns count of cells in column where data = Y
=COUNTIF(Sheet1!J15:Sheet1!J10000, "Y")
=COUNTIF(Sheet1!F15:Sheet1!F10000, "Y")
[Code].....