The omega CGI assumes that text in the database is encoded as UTF-8.
If you are writing your own search form, it is best to ensure that the query is sent as UTF-8. By default, web browsers send form parameters in the same encoding as the page containing the form (and the default encoding for HTML pages is ISO-8859-1). You can override this by adding the attribute accept-charset="UTF-8" to the <form> tag of your search form (this is safe to do even in a page which is explicitly UTF-8).
If the form parameters get sent as ISO-8859-1, there are several issues:
The first is that characters which aren't representable in ISO-8859-1 get sent as numeric HTML entities, such as &#25991;. But there's no way to distinguish these from the same text literally entered into the form by the user.
The second is that Omega can't simply re-encode the form data, because the form submission (whether by GET or POST) doesn't specify the encoding used.
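Both issues can be reproduced outside a browser. The following is a minimal Python sketch, not Omega code: it uses the standard library's urllib.parse.quote to stand in for a browser's form encoding, and the sample query strings are illustrative assumptions.

```python
from urllib.parse import quote

query = "café"

# The same query produces different bytes on the wire depending on the
# encoding chosen for the form submission, and nothing in the
# percent-encoded result says which encoding was used.
as_utf8 = quote(query, encoding="utf-8")
as_latin1 = quote(query, encoding="iso-8859-1")

# A character with no ISO-8859-1 representation is replaced by a numeric
# character reference (this mimics what browsers do for such forms).
typed_cjk = "文".encode("iso-8859-1", errors="xmlcharrefreplace")

# A user who literally types the entity text produces identical bytes,
# so the receiving side cannot tell the two cases apart.
typed_entity = "&#25991;".encode("iso-8859-1")

print(as_utf8)
print(as_latin1)
print(typed_cjk == typed_entity)
```

Sending the form as UTF-8 avoids both problems, since every character then has exactly one byte representation.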
If Xapian is asked to parse a query string which isn't valid UTF-8, it will fall back to handling it as ISO-8859-1, which will usually do the right thing for queries which are representable in ISO-8859-1. However, things like boolean filters in B parameters will be used as-is, so any which contain non-ASCII characters won't work properly.
When using omindex to index, the text in the database should automatically be UTF-8: omindex converts text extracted from documents to UTF-8 if it isn't already in this encoding. There's built-in code to handle the following: ISO-8859-1, ISO-8859-15, WINDOWS-1252, CP-1252, UTF-16, UCS-2, UTF-16BE, UCS-2BE, UTF-16LE, UCS-2LE. If omindex is built with iconv, many other encodings can be handled too.
For plain text, omindex looks for a Byte Order Mark (BOM) to recognise UTF-8, UTF-16BE, UTF-16LE, UTF-32BE and UTF-32LE. Otherwise files are assumed to be UTF-8, or ISO-8859-1 if they contain byte sequences which aren't valid UTF-8.
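The detection order described above can be sketched in Python as follows. This illustrates the general technique only and is not omindex's actual implementation; the function name decode_plain_text is hypothetical.

```python
import codecs

# BOMs to check. The UTF-32LE BOM starts with the same two bytes as the
# UTF-16LE BOM, so the UTF-32 entries must be tested first.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def decode_plain_text(data: bytes) -> str:
    """Decode file contents: BOM first, then UTF-8, then ISO-8859-1."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            # Strip the BOM and trust it; malformed content after the
            # BOM is replaced rather than raising (a choice made for
            # this sketch).
            return data[len(bom):].decode(encoding, errors="replace")
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Every byte sequence is valid ISO-8859-1, so this cannot fail.
        return data.decode("iso-8859-1")
```

For example, decode_plain_text(b"caf\xc3\xa9") and decode_plain_text(b"caf\xe9") both return "café": the first input is valid UTF-8, while the second is not and so is treated as ISO-8859-1.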
When omindex builds URLs, it percent-encodes bytes according to RFC 3986. On modern systems, filenames are usually encoded in UTF-8, and the bytes which make up multi-byte UTF-8 sequences will get encoded. In Omega 1.2.21 or 1.3.3 and later, the OmegaScript $prettyurl command will reverse this encoding for valid UTF-8 sequences, and so filenames should be shown with only the bare minimum of characters escaped.
However, if your filenames aren't encoded in UTF-8, $prettyurl will leave alone percent-encoded bytes for non-ASCII characters (it is possible it could find a valid UTF-8 sequence in other data and so show the wrong character, but this is unlikely in real-world data). Even so, everything should still work.
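The round trip can be illustrated with a short Python sketch. It is not the $prettyurl implementation: the helper name pretty is hypothetical, it only looks at percent-encoded non-ASCII bytes, and it treats each run of such bytes as a whole rather than one UTF-8 sequence at a time.

```python
import re
from urllib.parse import quote, unquote_to_bytes

# One or more consecutive percent-encoded bytes with the high bit set
# (first hex digit 8-F), i.e. bytes that cannot be ASCII.
_HIGH_RUN = re.compile(r"(?:%[89A-Fa-f][0-9A-Fa-f])+")


def pretty(url: str) -> str:
    """Undo percent-encoding for runs of bytes that are valid UTF-8."""

    def restore(match):
        raw = unquote_to_bytes(match.group(0))
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8 (for example an ISO-8859-1 filename):
            # leave the escapes exactly as they were.
            return match.group(0)

    return _HIGH_RUN.sub(restore, url)


# A UTF-8 filename and an ISO-8859-1 filename, as raw bytes on disk.
utf8_url = quote(b"/docs/caf\xc3\xa9 menu.txt")
latin1_url = quote(b"/docs/caf\xe9.txt")

print(utf8_url, "->", pretty(utf8_url))
print(latin1_url, "->", pretty(latin1_url))
```

In this sketch the UTF-8 filename is displayed with its accented character restored while the space stays escaped, and the ISO-8859-1 filename is left fully percent-encoded.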
When using scriptindex, you should ensure that text you feed to scriptindex is UTF-8.