Linux forum

General discussion

Grep & AWK to separate fields not in order or missing data

by sameohoster / June 14, 2009 10:18 PM PDT

I have a massive log file from Unix and checkpoint-based systems, in excess of 50 gigabytes, that I want to analyse with a script, but I am struggling to split the fields where some data is missing. Any help would be grand.

I can use awk perfectly if the fields are aligned correctly and separated by a single field separator. The command I use for this is awk -F';' '{print $1 $2 $3 $4 $5}' "$file" and it works perfectly. (The separator has to be quoted, otherwise the shell treats the ; as the end of the command.)

The problem arises when fields are not aligned and are not fixed width. I can arrange for the field name to be written just before each value when logging, but I still can't work out how to separate the fields and process them.

The two examples below show the data I want to process, with the date missing on record 3. The spaces added to the fields are for the sample only and are not part of the original data.

Sr#; Name; Joining Date; Department; Salary
001 Bill Gates; 01/01/1980; Microsoft; $100,000,00
002 Jermey Gourge; 01/01/1990; Unix; $100,000
003 George Shrinks; Virtualisation; $90,000
004 Sam Hunter; 01/01/1999; Games; $10,000
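
For this untagged layout, one possible approach is to decide per line whether the second field looks like a date and shift the remaining fields if it does not. The sketch below is only that, a sketch: the function name normalize_untagged is made up for illustration, and it assumes that dates always look like dd/dd/dddd, that the date is the only field that can be missing, that the serial number is the leading run of digits, and that the header row can be dropped.

```shell
# Hypothetical helper: pad the missing date column in the untagged layout.
normalize_untagged() {
  awk -F' *; *' '
  BEGIN { OFS = ";" }
  # Skip the header row and any line that does not start with a serial number.
  $1 !~ /^[0-9]+/ { next }
  {
    # The serial number is the leading digits of field 1; the rest is the name.
    match($1, /^[0-9]+/)
    sr = substr($1, 1, RLENGTH)
    name = substr($1, RLENGTH + 1)
    sub(/^[ \t]+/, "", name)
    # Assumption: a date is always two digits, two digits, four digits.
    if ($2 ~ /^[0-9][0-9]\/[0-9][0-9]\/[0-9][0-9][0-9][0-9]$/) {
      date = $2; dept = $3; salary = $4
    } else {
      # No date on this record: emit an empty column so later fields line up.
      date = ""; dept = $2; salary = $3
    }
    print sr, name, date, dept, salary
  }' "$@"
}
```

Called as normalize_untagged "$file", it reads the file once, line by line, and prints five ;-separated columns for every record, so the usual awk -F';' processing can follow.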

sr:001 name:Bill Gates; date:01/01/1980; dept:Microsoft; salary:$100,000,00
sr:002 name:Jermey Gourge; date:01/01/1990; dept:Unix; salary:$100,000
sr:003 name:George Shrinks; dept:Virtualisation; salary:$90,000
sr:004 name:Sam Hunter; date:01/01/1999; dept:Games; salary:$10,000
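
With the field name written in front of each value, the order and presence of fields stops mattering, because each value can be looked up by its key. A minimal sketch follows, assuming the only keys are sr, name, date, dept and salary as in the sample, and that none of those key names followed by a colon ever appears inside a value. The function name parse_tagged is hypothetical.

```shell
# Hypothetical helper: turn key:value records into fixed-order columns.
parse_tagged() {
  awk '
  BEGIN { OFS = ";" }
  {
    # Clear the values from the previous record so a missing key stays empty.
    split("", rec)
    line = $0
    # In the sample, sr and name are not separated by ";", so put a
    # separator in front of every known key before splitting.
    gsub(/(name|date|dept|salary):/, ";&", line)
    n = split(line, part, ";")
    for (i = 1; i <= n; i++) {
      p = part[i]
      gsub(/^[ \t]+|[ \t]+$/, "", p)   # trim surrounding blanks
      c = index(p, ":")                # the key is everything before the first colon
      if (c > 0) rec[substr(p, 1, c - 1)] = substr(p, c + 1)
    }
    print rec["sr"], rec["name"], rec["date"], rec["dept"], rec["salary"]
  }' "$@"
}
```

Run as parse_tagged "$file", this makes a single pass over the input and prints the columns in a fixed order, leaving an empty column where a key such as date is absent.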

I have also tried grep, which may work on small data sets but could take weeks to complete on a 50 GB file.

Any suggestions?

Sameo Hoster


