This online help file is for EFT Server version 6.3.x. For other versions of EFT Server, please refer to http://help.globalscape.com/help/index.html.
Unicode support has been added to EFT Server in version 6.3. The FAQs below are provided to answer questions you may have regarding EFT Server's Unicode support.
Q. What is Unicode?
A. EFT Server 6.3 provides the ability to see (in a directory listing) and transfer files that have non-English filenames, such as 一条大鱼.txt, 큰 물고기.xml, or 大きい魚.exe. In previous versions, these files would appear as ???????.exe or .xml. EFT Server 6.3 supports UTF-8, the variable-width Unicode character encoding standard. (Previous versions of EFT Server understood only ASCII.) The legacy format requires matching (or corresponding) ANSI code pages on the target system in order to view filenames in the intended format.
Q. Does EFT Server support international characters, UTF-8, Unicode, or double-byte characters?
A. EFT Server preserves UTF-8 encoded filenames (UTF-8 is a method of encoding Unicode) when transporting files over HTTP and SFTP, and when working with the Advanced Workflow Engine (AWE) module. Apart from AWE, Event Rules do not support UTF-8, and neither does EFT Server’s administration interface or its auditing and reporting of filenames (see the table below).
Q. What Unicode encoding mechanisms does EFT Server support?
A. At the file system (read/write) level, EFT Server encodes filenames using UCS-2. When providing filenames over SFTP and HTTP, EFT Server re-encodes them to UTF-8, a popular variable-length Unicode encoding format.
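As a rough illustration of the two encodings named above (this is not EFT Server's own code), the Python sketch below encodes one of the example filenames both ways and re-encodes between them. It assumes Python's `utf-16-le` codec as a stand-in for UCS-2, which is byte-identical for characters in the Basic Multilingual Plane; the helper name `reencode_ucs2_to_utf8` is hypothetical.

```python
def reencode_ucs2_to_utf8(raw: bytes) -> bytes:
    """Decode UCS-2 (little-endian) bytes and re-encode the text as UTF-8."""
    # Assumption: utf-16-le stands in for UCS-2 (same bytes for BMP characters).
    return raw.decode("utf-16-le").encode("utf-8")


name = "큰 물고기.xml"

on_disk = name.encode("utf-16-le")         # fixed width: 2 bytes per character
on_wire = reencode_ucs2_to_utf8(on_disk)   # variable width: 1 byte for ASCII, 3 for Hangul

print(len(name), "characters")
print(len(on_disk), "bytes as UCS-2")
print(len(on_wire), "bytes as UTF-8")

# Re-encoding between the two Unicode encodings does not change the filename.
assert on_wire.decode("utf-8") == name
```

Because both encodings can represent every character in the name, the re-encoding step changes only the byte layout, not the filename itself.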
Q. Which clients can I use to see Unicode filenames when I transfer files?
A. The Web Transfer Client (WTC) and Plain-Text Client (PTC) both support UTF-8. CuteFTP does NOT support UTF-8; therefore, Unicode filenames will continue to appear as "???????.exe" or ".xml" when using CuteFTP to transfer files to/from EFT Server.
Q. How do the Event Rule Actions handle Unicode filenames?
A. EFT Server’s Event Rules will essentially ignore files with non-ASCII file names.
Q. What happens if I have a file with a UTF-8 encoded Unicode filename such as 梅雨右折車線_US.ISO that was transferred to EFT Server over SFTP? How will that filename appear on disk? How will it appear in my reports? How will EFT Server’s Event Rules process the file?
A. EFT Server stores the file to disk, preserving the original Unicode filename; however, the filename is converted to ASCII for auditing purposes, resulting in an audited filename of ??????_US.iso. Likewise, when a report is generated, the filename ??????_US.iso appears in the report. The last three characters of the name (_US) and the extension are preserved because UTF-8 and ASCII encode English characters identically; for English characters there is therefore no loss of meaning (fidelity) after a UTF-8-to-ASCII conversion. The same UTF-8-to-ASCII conversion is used when handing off the filename to an Event Rule Action, which may cause the Action to fail. If data integration of UTF-8 encoded filenames is needed, consider deploying AWE rather than relying on EFT Server’s Event Rules alone.
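The down-conversion described above can be approximated in a few lines of Python. This is only a sketch of the general technique (replace every character that ASCII cannot represent with a question mark), not EFT Server's actual implementation; the function name `downconvert_to_ascii` is hypothetical, and the filename is modeled on the one in the question.

```python
def downconvert_to_ascii(filename: str) -> str:
    """Replace each non-ASCII character with '?'; ASCII characters pass through."""
    return filename.encode("ascii", errors="replace").decode("ascii")


original = "梅雨右折車線_US.iso"
audited = downconvert_to_ascii(original)

print(audited)  # prints ??????_US.iso

# A purely ASCII filename is unaffected by the conversion.
assert downconvert_to_ascii("invoice_US.iso") == "invoice_US.iso"
```

Each of the six non-ASCII characters becomes one question mark, while `_US.iso` survives unchanged, which is why the original name cannot be recovered from the audited one.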
Q. How do Unicode filenames appear in EFT Server’s administration interface?
A. They are "down converted" to ASCII and appear as "???????.exe" or ".xml."
Q. Will Unicode encoded filenames be preserved in EFT Server’s context variables, such as FS.FILENAME or FS.PATH?
A. Yes and no. For all Event Rule Events, Conditions, and Actions (except Advanced Workflows), EFT Server dereferences the variable as ASCII; that is, it down-converts the filename. In the case of Advanced Workflows, EFT Server preserves the UTF-8 encoded filename when the variable is dereferenced, so that AWE, which is UTF-8 aware, can consume the original UTF-8 encoded filename.
Q. If a file is transported via FTP, which only supports ASCII, then how will that file be processed by the OnUpload Event Rule? Will the file be converted to UTF-8 and then back to ASCII, or kept in ASCII? What if that file needs to be consumed by an Advanced Workflow (which supports UTF-8)?
A. The filename will be converted from ASCII to UTF-8 and then, prior to processing by the Event, back to ASCII. In the case of an Advanced Workflow, there will be no conversion back to ASCII. Therefore, the sequence is either ASCII->UTF-8->ASCII (for all Actions except AWE) or ASCII->UTF-8 (for AWE). In this situation there will be no loss of filename fidelity. What about FTP transfers to disk? In this case, the conversion will be ASCII->UCS-2. This can cause a Folder Monitor Event to fire, and then the whole sequence would be ASCII->UCS-2->UTF-8->ASCII. Keep in mind that UCS-2<->UTF-8 conversions do not lose fidelity. The only time filename fidelity can be lost is when converting to ASCII from any other encoding. That is, if the final output is UTF-8 or UCS-2, then you are safe. If the final output is ASCII, then you are only safe if the original (input) was ASCII. If the initial encoding was UTF-8, then you may end up with ????, but only for characters outside the 255-character ASCII range.
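The conversion sequences above can be checked with a small Python sketch. It is an illustration under stated assumptions rather than a model of EFT Server internals: `convert` is a hypothetical helper, `utf-16-le` stands in for UCS-2, `report_2011.txt` is a made-up ASCII filename, and Python's `ascii` codec is strictly 7-bit, so in this sketch characters 128 through 255 would also be replaced.

```python
def convert(name: str, chain: list[str]) -> str:
    """Pass a filename through each encoding in turn, replacing any
    character the target encoding cannot represent with '?'."""
    for encoding in chain:
        name = name.encode(encoding, errors="replace").decode(encoding)
    return name


UCS2 = "utf-16-le"  # assumption: stand-in for UCS-2 (identical for BMP characters)

ftp_name = "report_2011.txt"      # hypothetical name that arrives over FTP as ASCII
sftp_name = "梅雨右折車線_US.iso"   # name that arrives over SFTP as UTF-8

# ASCII -> UTF-8 -> ASCII (Event Rule Actions other than AWE): no loss.
assert convert(ftp_name, ["ascii", "utf-8", "ascii"]) == ftp_name

# ASCII -> UCS-2 -> UTF-8 -> ASCII (FTP to disk, then Folder Monitor): no loss.
assert convert(ftp_name, ["ascii", UCS2, "utf-8", "ascii"]) == ftp_name

# UTF-8 -> UCS-2 -> UTF-8: no loss, because both encodings cover all of Unicode's BMP.
assert convert(sftp_name, ["utf-8", UCS2, "utf-8"]) == sftp_name

# UTF-8 -> ASCII: the only lossy step.
print(convert(sftp_name, ["utf-8", "ascii"]))  # prints ??????_US.iso
```

The pattern matches the rule of thumb in the answer: a chain is safe if it ends in UTF-8 or UCS-2, or if it ends in ASCII and also started as ASCII.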
| Area | Component | Filename encoding |
|---|---|---|
| Protocols | HTTP (all user agents) | UTF-8 |
| Protocols | AS2/HTTP | UTF-8 |
| Protocols | FTP | ASCII |
| Protocols | SFTP (optional ASCII override via registry) | UTF-8 |
| Event Rules | Context Variable (internal representation) | UTF-8 |
| Event Rules | Context Variable (replace at runtime) | ASCII |
| Event Rules | LAN Copy | ASCII |
| Event Rules | Offload/Download SFTP | ASCII |
| Event Rules | Offload/Download FTP | ASCII |
| Event Rules | Offload/Download HTTP | ASCII |
| Event Rules | Generate Report | ASCII |
| Event Rules | Write to Windows Event Log | ASCII |
| Event Rules | Backup | ASCII |
| Event Rules | Clean | ASCII |
| Event Rules | OpenPGP | ASCII |
| Event Rules | Web Services | ASCII |
| Event Rules | Automatedworkflow.xml | UTF-8 |
| Event Rules | AWE (version 8) | UTF-8 |
| Auditing | Audit to ARM database | ASCII |
| Auditing | Component One Reports | ASCII |
| Auditing | Log (IIS) | ASCII |
| Auditing | Log (NCSA) | ASCII |
| Auditing | Log (W3C) | ASCII |
| Auditing | Log4Cplus | UTF-8 |
| Auditing | CL log | ASCII |
| Auditing | Client logs (extended logs) | ASCII |
| Administration Interface | Admin GUI (usernames, Event names, etc.) | ASCII |
| Administration Interface | VFS (folder names) | ASCII |
| Administration Interface | Status Viewer | ASCII |
| Administration Interface | Monitor page | ASCII |
| Administration Interface | Other UI | ASCII |