Re: CSV import suggestions

This WebDNA talk-list message is from

2007


It keeps the original formatting.
numero = 69173
interpreted = N
texte = Exactly the problem I am facing with outside vendors. I need a method that would convert csv formats that could potentially vary. I have screamed and screamed for tab delimited and I just GIVE up at this point!!! Send me the code to aplusmiami@gmail.com Vince -----Original Message----- From: WebDNA Talk [mailto:WebDNA-Talk@talk.smithmicro.com] On Behalf Of Matthew A Perosi Sent: Friday, July 06, 2007 12:50 AM To: WebDNA Talk Subject: Re: CSV import suggestions I saw all the contributions to this today and just didn't have time to add in my 2 cents until now. I'll explain what I do and if anyone wants to see my code I'll be glad to post it. I have several websites that needed CSV import routines. In my application I am dealing with vendors that are uploading product inventory. My online database is standardized, but the CSV files the vendors upload is different from vendor to vendor so every vendor has a slightly different CSV field to db mapping. What I found was that now matter how many requests (or demands) I made, the vendors were unreliable with the format of their data format. What I end up with are CSV files that indeed do have commas separating every cell, however not every cell was encapsulated in quotes. So I had to create a totally foolproof CSV import routine. It's very convoluted, but so far it's running since April 2006 without bugs. Here are the steps I take using an uploaded CSV file called export.txt, and a temp db for processing is called import.db. 1. Upload export.txt 2. closedatabase import.db 3. delete all records in import.db 4. rename export.txt to export.inc 5. read import.inc into a single variable called [import] 6. url wrap [import] 7. convert any pipes (%7C) into dash (-) characters... yes that seems strange, keep reading 8. convert carriage returns (%0D%0A) into the pipe (|) character because that will tell us where record ends. 9. convert & (%26) into dash (-) because that also tends to mess things up 10. Now strip out one record at a time from the [import] variable using [listwords Words=[import]&Delimiters=|] 11. Now step (filter) through every character of every stripped out record and make sure that every comma is converted into a TAB, unless the comma is between 2 quotes, in which case we keep the comma and erase the quotes. 12. After applying this convoluted filter from step 11 we are left with what looks like a normal WebDNA TAB db record. I then run [listwords words=[url][currentrecord][/url]&Tabs=T] and map every cell to the predetermined db cells for that vendor. I use the [index] within [listwords] to control the mapping and [replace] every cell into the import.db. 13. After all the data is sanitized through the above 12 steps I am left with a pure WebDNA db that I can do anything to. However, at that point I now validate every record against several additional predetermined import filters. If any record has incorrectly formatted data, the entire record is flagged for the vendor to manually check/correct. 14. After all the error checking from step 13 is completed the import.db I delete the vendor's entire live set of products and everything in import.db is loaded into the live products.db As I said, the vendors uploading the CSV files were not creating standard CSVs in this format: "1","2","3","4" So I could not rely on the built in ability for WebDNA to read that type of file. The above explanation does represent several weeks of work, and it works perfectly for my needs. I have since created another import routine when I'm sure the export.txt file is already TAB delimited, and another version when I am guaranteed not to have a comma in any cell between quotes. Hopefully the ideas can help you in the right direction. If anyone wants to see the code let me know and I'll be glad to provide everything because this is a useful tool in my bag of tricks. ----------------- Matthew A Perosi Psi Prime, Inc. 323 Union Blvd Totowa, NJ 07512 Phone: 973-413-8210 Fax: 973-413-8217 Vincent Medina wrote: > Anyone have any suggestions for me. I need to import a CSV file into a > webdna db. > > Vincent Medina > APCN > > > > ------------------------------------------------------------- > This message is sent to you because you are subscribed to > the mailing list . > To unsubscribe, E-mail to: > To switch to the DIGEST mode, E-mail to > Web Archive of this list is at: http://webdna.smithmicro.com/ > > > ------------------------------------------------------------- This message is sent to you because you are subscribed to the mailing list . To unsubscribe, E-mail to: To switch to the DIGEST mode, E-mail to Web Archive of this list is at: http://webdna.smithmicro.com/ ------------------------------------------------------------- This message is sent to you because you are subscribed to the mailing list . To unsubscribe, E-mail to: To switch to the DIGEST mode, E-mail to Web Archive of this list is at: http://webdna.smithmicro.com/ Associated Messages, from the most recent to the oldest:

    
  1. Re: CSV import suggestions ( "Vincent Medina" 2007)
  2. Re: CSV import suggestions ( "Vincent Medina" 2007)
  3. Re: CSV import suggestions ( "Vincent Medina" 2007)
  4. Re: CSV import suggestions ( Matthew A Perosi 2007)
  5. Re: CSV import suggestions ( Stuart Tremain 2007)
  6. Re: CSV import suggestions ( John Peacock 2007)
  7. Re: CSV import suggestions ( Donovan Brooke 2007)
  8. Re: CSV import suggestions ( "Gary Krockover" 2007)
  9. Re: CSV import suggestions ( Donovan Brooke 2007)
  10. Re: CSV import suggestions ( "Vincent Medina" 2007)
  11. Re: CSV import suggestions ( "Vincent Medina" 2007)
  12. Re: CSV import suggestions ( "Vincent Medina" 2007)
  13. Re: CSV import suggestions ( "Vincent Medina" 2007)
  14. Re: CSV import suggestions ( John Peacock 2007)
  15. Re: CSV import suggestions ( Donovan Brooke 2007)
  16. Re: CSV import suggestions ( Paul Willis 2007)
  17. Re: CSV import suggestions ( John Peacock 2007)
  18. CSV import suggestions ( "Vincent Medina" 2007)
Exactly the problem I am facing with outside vendors. I need a method that would convert csv formats that could potentially vary. I have screamed and screamed for tab delimited and I just GIVE up at this point!!! Send me the code to aplusmiami@gmail.com Vince -----Original Message----- From: WebDNA Talk [mailto:WebDNA-Talk@talk.smithmicro.com] On Behalf Of Matthew A Perosi Sent: Friday, July 06, 2007 12:50 AM To: WebDNA Talk Subject: Re: CSV import suggestions I saw all the contributions to this today and just didn't have time to add in my 2 cents until now. I'll explain what I do and if anyone wants to see my code I'll be glad to post it. I have several websites that needed CSV import routines. In my application I am dealing with vendors that are uploading product inventory. My online database is standardized, but the CSV files the vendors upload is different from vendor to vendor so every vendor has a slightly different CSV field to db mapping. What I found was that now matter how many requests (or demands) I made, the vendors were unreliable with the format of their data format. What I end up with are CSV files that indeed do have commas separating every cell, however not every cell was encapsulated in quotes. So I had to create a totally foolproof CSV import routine. It's very convoluted, but so far it's running since April 2006 without bugs. Here are the steps I take using an uploaded CSV file called export.txt, and a temp db for processing is called import.db. 1. Upload export.txt 2. closedatabase import.db 3. delete all records in import.db 4. rename export.txt to export.inc 5. read import.inc into a single variable called [import] 6. url wrap [import] 7. convert any pipes (%7C) into dash (-) characters... yes that seems strange, keep reading 8. convert carriage returns (%0D%0A) into the pipe (|) character because that will tell us where record ends. 9. convert & (%26) into dash (-) because that also tends to mess things up 10. Now strip out one record at a time from the [import] variable using [listwords Words=[import]&Delimiters=|] 11. Now step (filter) through every character of every stripped out record and make sure that every comma is converted into a TAB, unless the comma is between 2 quotes, in which case we keep the comma and erase the quotes. 12. After applying this convoluted filter from step 11 we are left with what looks like a normal WebDNA TAB db record. I then run [listwords words=[url][currentrecord][/url]&Tabs=T] and map every cell to the predetermined db cells for that vendor. I use the [index] within [listwords] to control the mapping and [replace] every cell into the import.db. 13. After all the data is sanitized through the above 12 steps I am left with a pure WebDNA db that I can do anything to. However, at that point I now validate every record against several additional predetermined import filters. If any record has incorrectly formatted data, the entire record is flagged for the vendor to manually check/correct. 14. After all the error checking from step 13 is completed the import.db I delete the vendor's entire live set of products and everything in import.db is loaded into the live products.db As I said, the vendors uploading the CSV files were not creating standard CSVs in this format: "1","2","3","4" So I could not rely on the built in ability for WebDNA to read that type of file. The above explanation does represent several weeks of work, and it works perfectly for my needs. I have since created another import routine when I'm sure the export.txt file is already TAB delimited, and another version when I am guaranteed not to have a comma in any cell between quotes. Hopefully the ideas can help you in the right direction. If anyone wants to see the code let me know and I'll be glad to provide everything because this is a useful tool in my bag of tricks. ----------------- Matthew A Perosi Psi Prime, Inc. 323 Union Blvd Totowa, NJ 07512 Phone: 973-413-8210 Fax: 973-413-8217 Vincent Medina wrote: > Anyone have any suggestions for me. I need to import a CSV file into a > webdna db. > > Vincent Medina > APCN > > > > ------------------------------------------------------------- > This message is sent to you because you are subscribed to > the mailing list . > To unsubscribe, E-mail to: > To switch to the DIGEST mode, E-mail to > Web Archive of this list is at: http://webdna.smithmicro.com/ > > > ------------------------------------------------------------- This message is sent to you because you are subscribed to the mailing list . To unsubscribe, E-mail to: To switch to the DIGEST mode, E-mail to Web Archive of this list is at: http://webdna.smithmicro.com/ ------------------------------------------------------------- This message is sent to you because you are subscribed to the mailing list . To unsubscribe, E-mail to: To switch to the DIGEST mode, E-mail to Web Archive of this list is at: http://webdna.smithmicro.com/ "Vincent Medina"

DOWNLOAD WEBDNA NOW!

Top Articles:

Talk List

The WebDNA community talk-list is the best place to get some help: several hundred extremely proficient programmers with an excellent knowledge of WebDNA and an excellent spirit will deliver all the tips and tricks you can imagine...

Related Readings:

WebCatalog 2.0 b 15 mac (1997) 'page impression' techniques for banner ads (1999) select multiple 2 more cents (1997) Setting up shop (1997) WC Database Format (1997) errormessages.db (1997) Locking up with WebCatalog... (1997) OT fullfillment centers / service (2000) Loss in form (1998) CommandSecurity? (1997) Country & Ship-to address & other fields ? (1997) credit card services (1997) session handling (2004) Duplicate Messages (1998) Re(2): Re(2): SSL, WebSTAR, WebCatalog (1998) Sending E-mail (1997) WebCat2.0 [format thousands .0f] no go (1997) Error Lob.db records error message not name (1997) search for all (1998) Re2: frames & carts (1997)