Re: CSV import suggestions
This WebDNA talk-list message is from 2007
It keeps the original formatting.
numero = 69173
interpreted = N
texte = Exactly the problem I am facing with outside vendors. I need a method thatwould convert csv formats that could potentially vary. I have screamed andscreamed for tab delimited and I just GIVE up at this point!!! Send me thecode to aplusmiami@gmail.comVince-----Original Message-----From: WebDNA Talk [mailto:WebDNA-Talk@talk.smithmicro.com] On Behalf OfMatthew A PerosiSent: Friday, July 06, 2007 12:50 AMTo: WebDNA TalkSubject: Re: CSV import suggestionsI saw all the contributions to this today and just didn't have time to add in my 2 cents until now. I'll explain what I do and if anyone wants to see my code I'll be glad to post it.I have several websites that needed CSV import routines. In my application I am dealing with vendors that are uploading product inventory. My online database is standardized, but the CSV files the vendors upload is different from vendor to vendor so every vendor has a slightly different CSV field to db mapping. What I found was that now matter how many requests (or demands) I made, the vendors were unreliable with the format of their data format. What I end up with are CSV files that indeed do have commas separating every cell, however not every cell was encapsulated in quotes. So I had to create a totally foolproof CSV import routine. It's very convoluted, but so far it's running since April 2006 without bugs.Here are the steps I take using an uploaded CSV file called export.txt, and a temp db for processing is called import.db.1. Upload export.txt2. closedatabase import.db3. delete all records in import.db4. rename export.txt to export.inc5. read import.inc into a single variable called [import]6. url wrap [import]7. convert any pipes (%7C) into dash (-) characters... yes that seems strange, keep reading8. convert carriage returns (%0D%0A) into the pipe (|) character because that will tell us where record ends.9. convert & (%26) into dash (-) because that also tends to mess things up10. Now strip out one record at a time from the [import] variable using [listwords Words=[import]&Delimiters=|]11. Now step (filter) through every character of every stripped out record and make sure that every comma is converted into a TAB, unless the comma is between 2 quotes, in which case we keep the comma and erase the quotes.12. After applying this convoluted filter from step 11 we are left with what looks like a normal WebDNA TAB db record. I then run [listwords words=[url][currentrecord][/url]&Tabs=T] and map every cell to the predetermined db cells for that vendor. I use the [index] within [listwords] to control the mapping and [replace] every cell into the import.db.13. After all the data is sanitized through the above 12 steps I am left with a pure WebDNA db that I can do anything to. However, at that point I now validate every record against several additional predetermined import filters. If any record has incorrectly formatted data, the entire record is flagged for the vendor to manually check/correct.14. After all the error checking from step 13 is completed the import.db I delete the vendor's entire live set of products and everything in import.db is loaded into the live products.dbAs I said, the vendors uploading the CSV files were not creating standard CSVs in this format:"1","2","3","4"So I could not rely on the built in ability for WebDNA to read that type of file. The above explanation does represent several weeks of work, and it works perfectly for my needs. I have since created another import routine when I'm sure the export.txt file is already TAB delimited, and another version when I am guaranteed not to have a comma in any cell between quotes.Hopefully the ideas can help you in the right direction. If anyone wants to see the code let me know and I'll be glad to provide everything because this is a useful tool in my bag of tricks.-----------------Matthew A PerosiPsi Prime, Inc.323 Union BlvdTotowa, NJ 07512Phone: 973-413-8210Fax: 973-413-8217Vincent Medina wrote:> Anyone have any suggestions for me. I need to import a CSV file into a> webdna db. >> Vincent Medina> APCN>>>> -------------------------------------------------------------> This message is sent to you because you are subscribed to> the mailing list
.> To unsubscribe, E-mail to: > To switch to the DIGEST mode, E-mail to> Web Archive of this list is at: http://webdna.smithmicro.com/>>> -------------------------------------------------------------This message is sent to you because you are subscribed to the mailing list .To unsubscribe, E-mail to: To switch to the DIGEST mode, E-mail toWeb Archive of this list is at: http://webdna.smithmicro.com/-------------------------------------------------------------This message is sent to you because you are subscribed to the mailing list .To unsubscribe, E-mail to: To switch to the DIGEST mode, E-mail to Web Archive of this list is at: http://webdna.smithmicro.com/
Associated Messages, from the most recent to the oldest:
Exactly the problem I am facing with outside vendors. I need a method thatwould convert csv formats that could potentially vary. I have screamed andscreamed for tab delimited and I just GIVE up at this point!!! Send me thecode to aplusmiami@gmail.comVince-----Original Message-----From: WebDNA Talk [mailto:WebDNA-Talk@talk.smithmicro.com] On Behalf OfMatthew A PerosiSent: Friday, July 06, 2007 12:50 AMTo: WebDNA TalkSubject: Re: CSV import suggestionsI saw all the contributions to this today and just didn't have time to add in my 2 cents until now. I'll explain what I do and if anyone wants to see my code I'll be glad to post it.I have several websites that needed CSV import routines. In my application I am dealing with vendors that are uploading product inventory. My online database is standardized, but the CSV files the vendors upload is different from vendor to vendor so every vendor has a slightly different CSV field to db mapping. What I found was that now matter how many requests (or demands) I made, the vendors were unreliable with the format of their data format. What I end up with are CSV files that indeed do have commas separating every cell, however not every cell was encapsulated in quotes. So I had to create a totally foolproof CSV import routine. It's very convoluted, but so far it's running since April 2006 without bugs.Here are the steps I take using an uploaded CSV file called export.txt, and a temp db for processing is called import.db.1. Upload export.txt2. closedatabase import.db3. delete all records in import.db4. rename export.txt to export.inc5. read import.inc into a single variable called [import]6. url wrap [import]7. convert any pipes (%7C) into dash (-) characters... yes that seems strange, keep reading8. convert carriage returns (%0D%0A) into the pipe (|) character because that will tell us where record ends.9. convert & (%26) into dash (-) because that also tends to mess things up10. Now strip out one record at a time from the [import] variable using [listwords Words=[import]&Delimiters=|]11. Now step (filter) through every character of every stripped out record and make sure that every comma is converted into a TAB, unless the comma is between 2 quotes, in which case we keep the comma and erase the quotes.12. After applying this convoluted filter from step 11 we are left with what looks like a normal WebDNA TAB db record. I then run [listwords words=[url][currentrecord][/url]&Tabs=T] and map every cell to the predetermined db cells for that vendor. I use the [index] within [listwords] to control the mapping and [replace] every cell into the import.db.13. After all the data is sanitized through the above 12 steps I am left with a pure WebDNA db that I can do anything to. However, at that point I now validate every record against several additional predetermined import filters. If any record has incorrectly formatted data, the entire record is flagged for the vendor to manually check/correct.14. After all the error checking from step 13 is completed the import.db I delete the vendor's entire live set of products and everything in import.db is loaded into the live products.dbAs I said, the vendors uploading the CSV files were not creating standard CSVs in this format:"1","2","3","4"So I could not rely on the built in ability for WebDNA to read that type of file. The above explanation does represent several weeks of work, and it works perfectly for my needs. I have since created another import routine when I'm sure the export.txt file is already TAB delimited, and another version when I am guaranteed not to have a comma in any cell between quotes.Hopefully the ideas can help you in the right direction. If anyone wants to see the code let me know and I'll be glad to provide everything because this is a useful tool in my bag of tricks.-----------------Matthew A PerosiPsi Prime, Inc.323 Union BlvdTotowa, NJ 07512Phone: 973-413-8210Fax: 973-413-8217Vincent Medina wrote:> Anyone have any suggestions for me. I need to import a CSV file into a> webdna db. >> Vincent Medina> APCN>>>> -------------------------------------------------------------> This message is sent to you because you are subscribed to> the mailing list .> To unsubscribe, E-mail to: > To switch to the DIGEST mode, E-mail to> Web Archive of this list is at: http://webdna.smithmicro.com/>>> -------------------------------------------------------------This message is sent to you because you are subscribed to the mailing list .To unsubscribe, E-mail to: To switch to the DIGEST mode, E-mail toWeb Archive of this list is at: http://webdna.smithmicro.com/-------------------------------------------------------------This message is sent to you because you are subscribed to the mailing list .To unsubscribe, E-mail to: To switch to the DIGEST mode, E-mail to Web Archive of this list is at: http://webdna.smithmicro.com/
"Vincent Medina"
DOWNLOAD WEBDNA NOW!
Top Articles:
Talk List
The WebDNA community talk-list is the best place to get some help: several hundred extremely proficient programmers with an excellent knowledge of WebDNA and an excellent spirit will deliver all the tips and tricks you can imagine...
Related Readings:
WebCatalog 2.0 b 15 mac (1997)
'page impression' techniques for banner ads (1999)
select multiple 2 more cents (1997)
Setting up shop (1997)
WC Database Format (1997)
errormessages.db (1997)
Locking up with WebCatalog... (1997)
OT fullfillment centers / service (2000)
Loss in form (1998)
CommandSecurity? (1997)
Country & Ship-to address & other fields ? (1997)
credit card services (1997)
session handling (2004)
Duplicate Messages (1998)
Re(2): Re(2): SSL, WebSTAR, WebCatalog (1998)
Sending E-mail (1997)
WebCat2.0 [format thousands .0f] no go (1997)
Error Lob.db records error message not name (1997)
search for all (1998)
Re2: frames & carts (1997)