[nycphp-talk] CSV Regex

Adam Fields adam at digitalpulp.com
Fri Feb 27 17:17:58 EST 2004


Hans Zaunere wrote:

> Hi all,
> 
> I'm looking to parse a CSV file and figured there must be a smooth regex
> for the job.  I'm of course getting the data line by line.  Google seems
> to think this is everyone's favorite regex for the job:
> 
> ,(?=([^"]*"[^"]*")*(?![^"]*"))
> 
> And, I must say, it works pretty well, using, for instance, the
> following command:
> 
>       $columns = preg_split('/,(?=([^"]*"[^"]*")*(?![^"]*"))/', fgets($fp));
> 
> The only issue is it returns the double quotes with each column.  So, if
> a column contains:
> 
> "D"
> 
> I get "D" back.  I'd prefer just D
> 
> Any tips?  Thanks,

You could just strip off the quotes afterwards, of course, but that's 
maybe not what you were looking for.
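
For what it's worth, here is a rough sketch of the strip-afterwards approach. It keeps the regex from the original question as-is; the helper name split_csv_line() is made up for illustration, and it assumes embedded quotes are escaped by doubling them, which not every CSV producer does.

```php
<?php
// Hypothetical helper (not from the original thread): split one CSV line
// with the regex quoted above, then remove the surrounding quotes.
function split_csv_line(string $line): array
{
    // Split on commas that are followed by an even number of double quotes,
    // i.e. commas that are not inside a quoted field.
    $pattern = '/,(?=([^"]*"[^"]*")*(?![^"]*"))/';
    $columns = preg_split($pattern, rtrim($line, "\r\n"));

    return array_map(function (string $field): string {
        // Only touch fields that are wrapped in a pair of double quotes.
        if (preg_match('/^"(.*)"$/s', $field, $m)) {
            // Assumption: an embedded quote is written as two quotes ("").
            return str_replace('""', '"', $m[1]);
        }
        return $field;
    }, $columns);
}

$columns = split_csv_line('a,"b,c","D"' . "\n");
// $columns is now ['a', 'b,c', 'D']
```

This still works one line at a time, so it does not help with newlines inside quoted fields.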

Dealing with quotes in CSV is sort of a disaster waiting to happen. 
Remember that quoted fields may themselves contain embedded quotes, 
and that newlines inside fields may or may not be supported.
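
One way to sidestep both problems is to skip the regex and let PHP's built-in fgetcsv() do the parsing, since it reads from the file handle directly and can follow a quoted field across line breaks. This is only a sketch: fgetcsv() is not mentioned earlier in the thread, the helper name read_csv_rows() and the sample data are made up, and it assumes the file escapes embedded quotes by doubling them.

```php
<?php
// Hypothetical helper: read every row from an open CSV stream with fgetcsv().
function read_csv_rows($fp): array
{
    $rows = [];
    // Length 0 means no line-length limit; the empty escape string (PHP 7.4+)
    // turns off backslash escaping so only doubled quotes are special.
    while (($row = fgetcsv($fp, 0, ',', '"', '')) !== false) {
        $rows[] = $row;
    }
    return $rows;
}

// Sample data (made up for illustration), served from an in-memory stream.
// The second record has a newline inside a quoted field; the third has
// embedded quotes written as "".
$fp = fopen('php://memory', 'w+');
fwrite($fp, "name,note\n\"D\",\"line one\nline two\"\n\"say \"\"hi\"\"\",x\n");
rewind($fp);
$rows = read_csv_rows($fp);
fclose($fp);
```

The fields come back with the surrounding quotes already removed, so there is nothing to strip afterwards.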

There's discussion of this in The Art of Unix Programming:

http://www.faqs.org/docs/artu/ch05s02.html#id2901882


