I am currently implementing a sqlite dll extension for having the regexp replace...
it is so useful that I decided to share it with you all.
first of all, I went with boost + icu because I met some issues with Qt. I still didn't find any workaround for using Qt...
I am working on a mac os x 10.6.3 and I don't know if it works on other platform, maybe yes, maybe not... Let me know :-)
I made a c++ file that I named sqliteextensions.cpp
#include "/usr/local/include/sqlite3ext.h"
SQLITE_EXTENSION_INIT1
#include <unicode/unistr.h>
#include <boost/regex/icu.hpp>
using namespace std;
extern "C" void sqlite3_regexp_replace(sqlite3_context *context, int argc, sqlite3_value **argv){
UnicodeString column_value_uft8 = UnicodeString::fromUTF8((const char*)sqlite3_value_text(argv[0]));
const unsigned char* pattern_utf8 = sqlite3_value_text(argv[1]);
UnicodeString pattern_replace_utf8 = UnicodeString::fromUTF8((const char*)sqlite3_value_text(argv[2]));
const boost::u32regex e = boost::make_u32regex(pattern_utf8); //from docs: UTF-8 when char size = 1
UnicodeString us = boost::u32regex_replace(column_value_uft8, e, pattern_replace_utf8);
string cs;
us.toUTF8String(cs);
sqlite3_result_text( context, cs.c_str(), -1, SQLITE_TRANSIENT );
}
extern "C" int sqlite3_extension_init(
sqlite3 *db,
char **pzErrMsg,
const sqlite3_api_routines *pApi
){
SQLITE_EXTENSION_INIT2(pApi)
sqlite3_create_function(db, "regexp_replace", 3, SQLITE_UTF8, 0, sqlite3_regexp_replace, 0, 0);
return 0;
}
then you compile this file using the command line below:
g++ -O3 -shared src/sqliteextensions.cpp \
-arch x86_64 -Xarch_x86_64 \
-framework CoreFoundation \
-L/usr/local/lib/ -lboost_regex \
-licudata -licui18n -licuio -licule -liculx -licutu -licuuc \
-o ../libsqliteextensions.dylib
now run sqlite3 (for me located in /usr/local/bin/ to have the latest version that I installed)
and type this command:
.load ../../libsqliteextensions.dylib
now you can use the regexp_replace function like:
update mytable set myfield = regexp_replace(myfield,'^[ \t]+','')
... and much more ... That's really great, isn't it?
it made my day :-)
I forgot to mention that I used ICU 4.2 and boost 1.43.
boost has been compiled with ICU, to make the regex lib aware of unicode.
Write to you soon.
Sylvain
Sylvian, could you publish this as a DLL that I can load with sqlite? I don't know how to compile this, but I would like to do a search & replace inside an sqlite db, and reading your blog, I think this is what do the trick, right?
RépondreSupprimerthanks
gert
unfortunately I don't have the time to publish and maintain a dll. However I gave all the instructions to build your lib on mac os x, I don't know how it would work on windows through.
SupprimerHello,
RépondreSupprimerIt seems that the #include lines are in incomplete, no ?
Your extension interest me, I try to compile boost with ICU support.
Thanks! I fixed it
Supprimer