Implementation
Let’s have the database table BUSINESS_ENTITY with the following columns:-
- NAME
- PARENT_NAME
- COUNTRY_NAME
- INDUSTRY_GROUP
- ADDRESS
- CITY
GRANT EXECUTE ON CTX_DDL TO <DB USERNAME>;From my experience working in a company for bespoke software development, in order to give the ability of a case-insensitive search we need to create a dedicated BASIC_LEXER. This lexer identifies tokens for creating text indexes for English and all other supported whitespace delimited languages.This will give EXECUTE privilege on CTX_DDL PL/SQL package to the database user of our application. The package creates and manages the preferences, section groups, and stoplists required for text indexes. Execute the following SQL query:
begin
ctx_ddl.create_preference('business_entity_lexer', 'BASIC_LEXER');
ctx_ddl.set_attribute('business_entity_lexer', 'mixed_case', 'NO');
end;
Execute the following SQL query:
Next, we need to create a MULTI_COLUMN_DATASTORE. It’s definition will be consisted of all the column names of our BUSINESS_ENTITY table. This will allow us to search across all the entity columns while maintaining low complexity when generating the search SQL query later.
begin
ctx_ddl.create_preference('business_entity_datastore', 'MULTI_COLUMN_DATASTORE');
ctx_ddl.set_attribute('business_entity_datastore', 'columns', 'name, parent_name,
country_name, industry_group, address, city');
end;
Finally we need to create the text index. Notice that we are creating index only on one of the table columns, but providing the created MULTI_COLUMN_DATASTORE as well as the BASIC_LEXER will ensure us case-insensitive search across all the BUSINESS_ENTITY table columns.
create index be_name_idx on business_entity (name) indextype is ctxsys.context parameters
('datastore business_entity_datastore lexer business_entity_lexer')
Finally the SQL query which will be eecuted after the user’s input search string has been passed to the back-end from the input form. The idea here is that the input string is split by empty string and percentage symbol is added at the beginning and the end of every word. This will ensure the substring match. The good thing is that we can execute this SQL with provided named parameter, rather than concatenated SQL with user input which is a bad practice leading to SQL injection vulnerability.
SELECT NAME, PARENT_NAME, COUNTRY_NAME, INDUSTRY_GROUP, ADDRESS, CITY FROM BUSINESS_ENTITY WHERE CONTAINS(NAME, ‘%google% OR %ireland%’) > 1Notice the ability of using boolean operators in concatenation with the split user input string from the search form.
Maintenance
Naturally the question “How we are going to maintain the created text indexes when new records are added/updated in the database table?” comes in every developers mind. I would not get into details here, but my suggestion is to check the “Sync (On Commit) new in Oracle 10g”. Oracle’s proposal here is to take advantage of the sync (on commit) parameter when creating the text index. Another approach is the index synchronization to be executed on predefined time period with sync (Every ...) parameter.What’s the point
Oracle Text is a great feature and you should consider it if your organization needs powerful search capabilities under the following circumstances:- The application already uses Oracle DB in it’s persistence layer
- Adding new technology is not on the agenda
- You want to avoid the additional DevOps overhead in terms of monitoring, backup and infrastructure of a separate search engine
