[ale] perl/postgresql question

Jim Kinney jim.kinney at gmail.com
Thu Mar 31 16:52:42 EDT 2011


I was searching for a clue and saw the post on the pg list :-) Got all
excited thinking I had found a solution to point to you to! Then I realized,
"Hey! This looks REALLY similar..."

Maybe a single run then a grep for non-ascii characters?

On Thu, Mar 31, 2011 at 4:11 PM, Geoffrey Myers <lists at serioustechnology.com
> wrote:

> Jim Kinney wrote:
> > The "g" _should_ do it. Quite puzzling.
> > Can you do a run and find an example of when it fails? Is it always 0xbd?
>
> The sad thing is, all it tells me is which table in the database has the
> error.  It does not identify the field.  I've posted to the Postgres
> list to see if there's any way to narrow down the culprit, there is not.
>
> Once it fails, it stops loading that table, so I never get pass this error.
>
> When I first caught this error, I was simply trying to load an SQL_ASCII
> database dump into a UTF8 database.  That's when I wrote my cleandata.pl
> code.  The code cleans 62750 records.  I just don't understand why it
> doesn't fix this one.
>
> > Maybe someone in Tibet doesn't like your code :-)  (It's a Tibetan
> > character code in utf8)
>
> But what is it in SQL_ASCII?  What happened is, some enterprising users
> figured out how to enter special characters from the keyboard, say, like
> a degrees symbol.  That goes in to the SQL_ASCII database fine, but when
> you try to load that into the UTF8 encoded database, it pukes.
>
> >
> > On Thu, Mar 31, 2011 at 2:59 PM, Geoffrey Myers
> > <lists at serioustechnology.com <mailto:lists at serioustechnology.com>>
> wrote:
> >
> >     Jim Kinney wrote:
> >      > should it be replaced with nothing or with a NULL? Does the perl
> >     pointer
> >      > jump to the next character automatically when a match is made thus
> >      > skipping a character in your replace? Does a double run fix it?
> >
> >     Missed the first question.  It should be removed, not replaced with
> an
> >     null.  For example I have:
> >
> >     This is my <UGLY CHARACTER>data
> >
> >     Should then be:
> >
> >     This is my data
> >
> >      >
> >      > On Thu, Mar 31, 2011 at 12:04 PM, Geoffrey Myers
> >      > <lists at serioustechnology.com <mailto:lists at serioustechnology.com>
> >     <mailto:lists at serioustechnology.com
> >     <mailto:lists at serioustechnology.com>>> wrote:
> >      >
> >      >     So, we are trying to convert a number of postgresql databases
> >     that were
> >      >     created with SQL_ASCII encoding to UTF8 encoding.  As such, I
> >     need to
> >      >     strip certain characters out of the data before dumping and
> >     reloading
> >      >     the new databases.  I'm using the following:
> >      >
> >      >     data =~ s/(.)/((ord($1) >= 0) && (ord($1) <= 8))
> >      >                     || (ord($1) == 11)
> >      >                     || ((ord($1) >= 13) && (ord($1) <= 31))
> >      >                     || ((ord($1) >= 127)) ?"": $1/egs;
> >      >
> >      >     Yet I'm getting the following error, which indicates a
> >     character that
> >      >     should be handled by the above code is still in the data:
> >      >
> >      >     pg_restore: [archiver (db)] COPY failed: ERROR:  invalid byte
> >     sequence
> >      >     for encoding "UTF8": 0xbd
> >      >
> >      >
> >      >     Certainly, the above code should replace the 0xbd with
> nothing?
> >      >
> >      >     Any perl/postgres gurus out there?
> >      >
> >      >     --
> >      >     Until later, Geoffrey
> >      >
> >      >     "I predict future happiness for America if they can prevent
> >      >     the government from wasting the labors of the people under
> >      >     the pretense of taking care of them."
> >      >     - Thomas Jefferson
> >      >     _______________________________________________
> >      >     Ale mailing list
> >      >     Ale at ale.org <mailto:Ale at ale.org> <mailto:Ale at ale.org
> >     <mailto:Ale at ale.org>>
> >      >     http://mail.ale.org/mailman/listinfo/ale
> >      >     See JOBS, ANNOUNCE and SCHOOLS lists at
> >      >     http://mail.ale.org/mailman/listinfo
> >      >
> >      >
> >      >
> >      >
> >      > --
> >      > --
> >      > James P. Kinney III
> >      > I would rather stumble along in freedom than walk effortlessly in
> >     chains.
> >      >
> >      >
> >      >
> >      >
> >
> ------------------------------------------------------------------------
> >      >
> >      > _______________________________________________
> >      > Ale mailing list
> >      > Ale at ale.org <mailto:Ale at ale.org>
> >      > http://mail.ale.org/mailman/listinfo/ale
> >      > See JOBS, ANNOUNCE and SCHOOLS lists at
> >      > http://mail.ale.org/mailman/listinfo
> >
> >
> >     --
> >     Until later, Geoffrey
> >
> >     "I predict future happiness for America if they can prevent
> >     the government from wasting the labors of the people under
> >     the pretense of taking care of them."
> >     - Thomas Jefferson
> >     _______________________________________________
> >     Ale mailing list
> >     Ale at ale.org <mailto:Ale at ale.org>
> >     http://mail.ale.org/mailman/listinfo/ale
> >     See JOBS, ANNOUNCE and SCHOOLS lists at
> >     http://mail.ale.org/mailman/listinfo
> >
> >
> >
> >
> > --
> > --
> > James P. Kinney III
> > I would rather stumble along in freedom than walk effortlessly in chains.
> >
> >
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > Ale mailing list
> > Ale at ale.org
> > http://mail.ale.org/mailman/listinfo/ale
> > See JOBS, ANNOUNCE and SCHOOLS lists at
> > http://mail.ale.org/mailman/listinfo
>
>
> --
> Until later, Geoffrey
>
> "I predict future happiness for America if they can prevent
> the government from wasting the labors of the people under
> the pretense of taking care of them."
> - Thomas Jefferson
> _______________________________________________
> Ale mailing list
> Ale at ale.org
> http://mail.ale.org/mailman/listinfo/ale
> See JOBS, ANNOUNCE and SCHOOLS lists at
> http://mail.ale.org/mailman/listinfo
>



-- 
-- 
James P. Kinney III
I would rather stumble along in freedom than walk effortlessly in chains.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.ale.org/pipermail/ale/attachments/20110331/6d66bab7/attachment-0001.html 


More information about the Ale mailing list