    Why UNICODE is important before you start programming

    Author: 2007-09-06 10:59:59 From:

    First of all, you probably have a question in mind:
    What is UNICODE?
    Unicode is a character coding system that assigns a number to every character of every language. On Windows, Unicode text is stored as UTF-16, where each character takes 2 bytes (16 bits), or 4 bytes for the rarer characters outside the basic range, rather than a single 8-bit byte. The characters you see on a standard English keyboard are usually stored in one byte each. To test this, create a new text file in Notepad, type a single character in it, save it with ANSI encoding, and check its size: it will report 1 byte.
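
    The size difference can be checked in code as well. The following is a minimal, portable C sketch (the function names are made up for illustration) that counts the bytes needed to store the same text as 8-bit characters and as 16-bit units. It uses uint16_t as a stand-in for the Windows WCHAR type, which is an assumption of the sketch:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes used by a 1-byte-per-character string, not counting the terminator. */
size_t narrow_size_in_bytes(const char *text)
{
    return strlen(text) * sizeof(char);
}

/* Bytes used by a string stored in 16-bit units, not counting the terminator.
   uint16_t stands in for WCHAR here (an assumption of this sketch). */
size_t wide_size_in_bytes(const uint16_t *text)
{
    size_t units = 0;
    while (text[units] != 0)
        units++;
    return units * sizeof(uint16_t);
}
```

    For the single letter 'A', narrow_size_in_bytes("A") gives 1, while wide_size_in_bytes() applied to the array { 0x0041, 0 } gives 2.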


    Why is it important for Windows programmers?
    When your application can be compiled with both the UNICODE and the non-UNICODE functions, you cover a wide range of Microsoft operating systems.
    Windows 95 and 98 run non-Unicode applications; Windows NT, 2000, XP, and Vista use Unicode natively, but they also accommodate non-Unicode applications. So compiling your software for Windows 98/95 as well reaches users who are still running old operating systems, which may result in good business too.

    For example, suppose you have developed an office automation tool that you sell to Windows NT, 2000, XP, and Vista users, but you do not provide a version for Windows 98 and 95 users. Then you are not covering the full range. By providing all users with the same application, you indirectly give them the idea of switching to a new operating system too, and obviously increase your business.



    Single byte storage
    The problem with 1-byte character storage is that 8 bits can hold values only from 0 to 255, so each byte can represent at most 256 different characters. To solve this problem (keeping international languages in view), a new system called 'Unicode' was introduced. Using this system, we can store a character in 2 bytes (16 bits), which gives us 65,536 possible values (0 to 65535). For example, the range from 1665 to 1773 contains some Arabic characters.
    On web pages, you display such characters with &#THENUMBER; where 'THENUMBER' is the numeric value of the character.
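
    As a small illustration of the &#THENUMBER; notation, here is a hedged C sketch that builds such a reference from a numeric character value. The function name format_char_reference is hypothetical, chosen only for this example:

```c
#include <stddef.h>
#include <stdio.h>

/* Writes "&#N;" for the given character value into out.
   Returns 1 on success, 0 if the buffer is too small. */
int format_char_reference(unsigned int value, char *out, size_t out_size)
{
    int written = snprintf(out, out_size, "&#%u;", value);
    return written > 0 && (size_t)written < out_size;
}
```

    Called with the value 1665 and a 16-byte buffer, it fills the buffer with &#1665; which a browser then renders as the matching character.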

    Let's come to the point. There are two kinds of functions in the Windows API: non-Unicode and Unicode. For example, the function MessageBox is defined two ways in the Windows API, as MessageBoxA and as MessageBoxW. The one ending in 'A' is the non-Unicode (ANSI) version, which accepts text defined as a CHAR array; the one ending in 'W' is the Unicode version, which accepts text defined as a WCHAR (wide character) array. The name MessageBox itself is a macro that resolves to one of the two, depending on whether UNICODE is defined.
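
    The naming scheme can be imitated in a few lines of portable C. This is only a sketch of the idea: DemoTextLengthA and DemoTextLengthW are hypothetical stand-ins for a real 'A' / 'W' pair, and the real Windows headers are more elaborate.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for an 'A' / 'W' function pair; each one simply
   returns the number of characters it was given. */
size_t DemoTextLengthA(const char *text)    { return strlen(text); }
size_t DemoTextLengthW(const wchar_t *text) { return wcslen(text); }

/* The unsuffixed name is only a macro that selects one of the pair. */
#ifdef UNICODE
#define DemoTextLength DemoTextLengthW
#else
#define DemoTextLength DemoTextLengthA
#endif
```

    Code that calls DemoTextLength never names the 'A' or 'W' variant directly, so the same source line binds to a different function depending on the build.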

    Put simply, use the types:

    CHAR for non-unicode and
    WCHAR for unicode application

    NOTE:
      If you are using Dev-C++, your application is non-Unicode by default. To compile it as a Unicode application, simply put two defines at the top of your code, before any #include:

    #define UNICODE
    #define _UNICODE


      And if you are using Microsoft Visual Studio Express Edition, your application is UNICODE-enabled by default. To compile it as a non-Unicode application, put two undefines at the top of your code, before any #include:

    #undef UNICODE
    #undef _UNICODE


    Besides, Microsoft Visual Studio provides a very good project options panel, through which you can change this setting to compile as either a Unicode or a Multi-Byte Character Set application.
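
    Two symbols are needed because UNICODE is read by the Windows headers, while _UNICODE is read by <tchar.h> and the C run-time headers. A small guard like the following sketch, placed after the defines, can catch a mismatch at compile time. The function compiled_as_unicode is a hypothetical helper added only to make the result visible:

```c
/* UNICODE is read by the Windows headers, _UNICODE by <tchar.h> and the
   C run-time headers, so the two should be defined or undefined together. */
#if defined(UNICODE) != defined(_UNICODE)
#error "Define (or undefine) UNICODE and _UNICODE together"
#endif

/* Returns 1 when this file is being compiled as Unicode, 0 otherwise. */
int compiled_as_unicode(void)
{
#ifdef UNICODE
    return 1;
#else
    return 0;
#endif
}
```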



    So, do I have to maintain two separate source codes for the same application?


    NO, the best solution is to use TCHAR, as:

    TCHAR myStrVar[64];
    
    NOTE: you may need to include <tchar.h>, especially if you are using a compiler other than Microsoft Visual Studio.

    And enclose every character string literal in the _T( ) macro, which expands according to the current character set.

    e.g. The following code compiles as both a Unicode and a non-Unicode application:
    MessageBox(NULL,_T("My test message"),_T("My test title"),0);
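
    To see roughly how TCHAR and _T( ) adapt to the build, here is a simplified, portable imitation. The names SKETCH_TCHAR, SKETCH_T, and sketch_length are hypothetical, and the real definitions in the Windows headers are more involved:

```c
#include <stddef.h>
#include <wchar.h>

#ifdef UNICODE
typedef wchar_t SKETCH_TCHAR;   /* wide characters in a Unicode build */
#define SKETCH_T(x) L##x        /* "text" becomes L"text" */
#else
typedef char SKETCH_TCHAR;      /* plain characters otherwise */
#define SKETCH_T(x) x           /* "text" stays "text" */
#endif

/* Works for either character width because it only uses SKETCH_TCHAR. */
size_t sketch_length(const SKETCH_TCHAR *text)
{
    size_t n = 0;
    while (text[n] != 0)
        n++;
    return n;
}

/* The same source line yields a narrow or a wide string constant. */
const SKETCH_TCHAR sketch_title[] = SKETCH_T("My test title");
```

    Because both the type and the literal switch together, sketch_length(sketch_title) returns 13 in either build.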



    PROBLEMS



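    1. strcat and other standard string manipulation functions do not seem to compile when compiling as UNICODE-enabled. Why?
    The standard functions such as strcat and sprintf accept only char strings, so they reject TCHAR strings once TCHAR becomes a wide character. When you are developing an application for both character sets, use functions that follow the current character set, such as the Windows-specific ones:
    use lstrcat instead of strcat, and wsprintf instead of sprintf (or the generic-text names from <tchar.h>, such as _tcscat). Use wcscat and the other wcs functions only for strings that are always wide. There is a Windows-specific alternative for many string manipulation functions. A full list of Win32 Equivalents for C Run-Time Functions is available at the end of this document.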
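
    The idea behind such character-set-aware function names can be sketched in portable C. All DEMO_ and demo_ names below are hypothetical and only imitate the mechanism; they are not the actual Windows or <tchar.h> definitions:

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

#ifdef UNICODE
typedef wchar_t DEMO_TCHAR;
#define DEMO_T(x)   L##x
#define demo_tcscat wcscat      /* wide build: use the wcs functions */
#define demo_tcslen wcslen
#else
typedef char DEMO_TCHAR;
#define DEMO_T(x)   x
#define demo_tcscat strcat      /* narrow build: use the str functions */
#define demo_tcslen strlen
#endif

/* Builds "Hello, world" in the caller's buffer (at least 13 characters)
   and returns its length, without naming a character width anywhere. */
size_t build_greeting(DEMO_TCHAR *buffer)
{
    buffer[0] = 0;
    demo_tcscat(buffer, DEMO_T("Hello, "));
    demo_tcscat(buffer, DEMO_T("world"));
    return demo_tcslen(buffer);
}
```

    The calling code is written once; only the macro definitions decide whether strcat or wcscat is actually compiled in.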


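    2. Another problem: when you pass a string variable to the _T() macro, it gives a compiler error, such as with:

    TCHAR myStrVar[] = _T("My test message");
    MessageBox(NULL,_T(myStrVar),_T(myStrVar),0);


    The reason is that, in a Unicode build, _T() pastes an L prefix onto its argument, so _T(myStrVar) becomes the undeclared name LmyStrVar.
    Solution: simply do not use _T() when passing a variable instead of a constant character string, as:

    TCHAR myStrVar[] = _T("My test message");
    MessageBox(NULL,myStrVar,myStrVar,0);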
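
    What the preprocessor does to a variable name can be made visible with a short experiment. WIDE_T and NARROW_T below are simplified, hypothetical stand-ins for the two possible definitions of _T(), and the stringizing macros turn the expanded result into readable text:

```c
/* Two-step stringizing: XSTR() expands its argument first, then STR()
   turns the result into a string literal. */
#define STR(x)  #x
#define XSTR(x) STR(x)

/* Simplified stand-ins for the two possible definitions of _T(). */
#define WIDE_T(x)   L##x   /* Unicode build */
#define NARROW_T(x) x      /* non-Unicode build */

/* What the compiler sees after the preprocessor has run. */
const char *wide_expansion(void)   { return XSTR(WIDE_T(myStrVar)); }
const char *narrow_expansion(void) { return XSTR(NARROW_T(myStrVar)); }
```

    Here wide_expansion() returns the text LmyStrVar, a name that was never declared, while narrow_expansion() returns myStrVar. This is why the mistake may go unnoticed in a non-Unicode build and only fail once UNICODE is defined.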





    Win32 Equivalents for C Run-Time Functions


    Directly copied from the Microsoft Win32 Software Development Kit (SDK) manual, versions 3.1, 3.5, 3.51, and 4.0.
    Please note the String Manipulation section in the following table.


    SUMMARY

    Many of the C Run-time functions have direct equivalents in the Win32
    application programming interface (API). This article lists the C Run-time
    functions by category with their Win32 equivalents or the word "none" if no
    equivalent exists.


    MORE INFORMATION

    NOTE: the functions that are followed by an asterisk (*) are part of the
    16-bit C Run-time only. Functions that are unique to the 32-bit C Run-time
    are listed separately in the last section. All other functions are common
    to both C Run-times.

    Buffer Manipulation
    -------------------
    _memccpy none
    memchr none
    memcmp none
    memcpy CopyMemory
    _memicmp none
    memmove MoveMemory
    memset FillMemory, ZeroMemory
    _swab none

    Character Classification
    ------------------------
    isalnum IsCharAlphaNumeric
    isalpha IsCharAlpha, GetStringTypeW (Unicode)
    __isascii none
    iscntrl none, GetStringTypeW (Unicode)
    __iscsym none
    __iscsymf none
    isdigit none, GetStringTypeW (Unicode)
    isgraph none
    islower IsCharLower, GetStringTypeW (Unicode)
    isprint none
    ispunct none, GetStringTypeW (Unicode)
    isspace none, GetStringTypeW (Unicode)
    isupper IsCharUpper, GetStringTypeW (Unicode)
    isxdigit none, GetStringTypeW (Unicode)
    __toascii none
    tolower CharLower
    _tolower none
    toupper CharUpper
    _toupper none

    Directory Control
    -----------------
    _chdir SetCurrentDirectory
    _chdrive SetCurrentDirectory
    _getcwd GetCurrentDirectory
    _getdrive GetCurrentDirectory
    _mkdir CreateDirectory
    _rmdir RemoveDirectory
    _searchenv SearchPath

    File Handling
    -------------
    _access none
    _chmod SetFileAttributes
    _chsize SetEndOfFile
    _filelength GetFileSize
    _fstat See Note 5
    _fullpath GetFullPathName
    _get_osfhandle none
    _isatty GetFileType
    _locking LockFileEx
    _makepath none
    _mktemp GetTempFileName
    _open_osfhandle none
    remove DeleteFile
    rename MoveFile
    _setmode none
    _splitpath none
    _stat none
    _umask none
    _unlink DeleteFile

    Creating Text Output Routines
    -----------------------------
    _displaycursor* SetConsoleCursorInfo
    _gettextcolor* GetConsoleScreenBufferInfo
    _gettextcursor* GetConsoleCursorInfo
    _gettextposition* GetConsoleScreenBufferInfo
    _gettextwindow* GetConsoleWindowInfo
    _outtext* WriteConsole
    _scrolltextwindow* ScrollConsoleScreenBuffer
    _settextcolor* SetConsoleTextAttribute
    _settextcursor* SetConsoleCursorInfo
    _settextposition* SetConsoleCursorPosition
    _settextwindow* SetConsoleWindowInfo
    _wrapon* SetConsoleMode

    Stream Routines
    ---------------
    clearerr none
    fclose CloseHandle
    _fcloseall none
    _fdopen none
    feof none
    ferror none
    fflush FlushFileBuffers
    fgetc none
    _fgetchar none
    fgetpos none
    fgets none
    _fileno none
    _flushall none
    fopen CreateFile
    fprintf none
    fputc none
    _fputchar none
    fputs none
    fread ReadFile
    freopen (std handles) SetStdHandle
    fscanf none
    fseek SetFilePointer
    fsetpos SetFilePointer
    _fsopen CreateFile
    ftell SetFilePointer (check return value)
    fwrite WriteFile
    getc none
    getchar none
    gets none
    _getw none
    printf none
    putc none
    putchar none
    puts none
    _putw none
    rewind SetFilePointer
    _rmtmp none
    scanf none
    setbuf none
    setvbuf none
    _snprintf none
    sprintf wsprintf
    sscanf none
    _tempnam GetTempFileName
    tmpfile none
    tmpnam GetTempFileName
    ungetc none
    vfprintf none
    vprintf none
    _vsnprintf none
    vsprintf wvsprintf

    Low-Level I/O
    -------------
    _close _lclose, CloseHandle
    _commit FlushFileBuffers
    _creat _lcreat, CreateFile
    _dup DuplicateHandle
    _dup2 none
    _eof none
    _lseek _llseek, SetFilePointer
    _open _lopen, CreateFile
    _read _lread, ReadFile
    _sopen CreateFile
    _tell SetFilePointer (check return value)
    _write _lwrite, WriteFile

    Console and Port I/O Routines
    -----------------------------
    _cgets none
    _cprintf none
    _cputs none
    _cscanf none
    _getch ReadConsoleInput
    _getche ReadConsoleInput
    _inp none
    _inpw none
    _kbhit PeekConsoleInput
    _outp none
    _outpw none
    _putch WriteConsoleInput
    _ungetch none

    Memory Allocation
    -----------------
    _alloca none
    _bfreeseg* none
    _bheapseg* none
    calloc GlobalAlloc
    _expand none
    free GlobalFree
    _freect* GlobalMemoryStatus
    _halloc* GlobalAlloc
    _heapadd none
    _heapchk none
    _heapmin none
    _heapset none
    _heapwalk none
    _hfree* GlobalFree
    malloc GlobalAlloc
    _memavl GlobalMemoryStatus
    _memmax GlobalMemoryStatus
    _msize* GlobalSize
    realloc GlobalReAlloc
    _set_new_handler none
    _set_hnew_handler* none
    _stackavail* none

    Process and Environment Control Routines
    ----------------------------------------
    abort none
    assert none
    atexit none
    _cexit none
    _c_exit none
    _exec functions none
    exit ExitProcess
    _exit ExitProcess
    getenv GetEnvironmentVariable
    _getpid GetCurrentProcessId
    longjmp none
    _onexit none
    perror FormatMessage
    _putenv SetEnvironmentVariable
    raise RaiseException
    setjmp none
    signal (ctrl-c only) SetConsoleCtrlHandler
    _spawn functions CreateProcess
    system CreateProcess

    String Manipulation
    -------------------
    strcat, wcscat lstrcat
    strchr, wcschr none
    strcmp, wcscmp lstrcmp
    strcpy, wcscpy lstrcpy
    strcspn, wcscspn none
    _strdup, _wcsdup none
    strerror FormatMessage
    _strerror FormatMessage
    _stricmp, _wcsicmp lstrcmpi
    strlen, wcslen lstrlen
    _strlwr, _wcslwr CharLower, CharLowerBuffer
    strncat, wcsncat none
    strncmp, wcsncmp none
    strncpy, wcsncpy none
    _strnicmp, _wcsnicmp none
    _strnset, _wcsnset FillMemory, ZeroMemory
    strpbrk, wcspbrk none
    strrchr, wcsrchr none
    _strrev, _wcsrev none
    _strset, _wcsset FillMemory, ZeroMemory
    strspn, wcsspn none
    strstr, wcsstr none
    strtok, wcstok none
    _strupr, _wcsupr CharUpper, CharUpperBuffer

    MS-DOS Interface
    ----------------
    _bdos* none
    _chain_intr* none
    _disable* none
    _dos_allocmem* GlobalAlloc
    _dos_close* CloseHandle
    _dos_commit* FlushFileBuffers
    _dos_creat* CreateFile
    _dos_creatnew* CreateFile
    _dos_findfirst* FindFirstFile
    _dos_findnext* FindNextFile
    _dos_freemem* GlobalFree
    _dos_getdate* GetSystemTime
    _dos_getdiskfree* GetDiskFreeSpace
    _dos_getdrive* GetCurrentDirectory
    _dos_getfileattr* GetFileAttributes
    _dos_getftime* GetFileTime
    _dos_gettime* GetSystemTime
    _dos_getvect* none
    _dos_keep* none
    _dos_open* OpenFile
    _dos_read* ReadFile
    _dos_setblock* GlobalReAlloc
    _dos_setdate* SetSystemTime
    _dos_setdrive* SetCurrentDirectory
    _dos_setfileattr* SetFileAttributes
    _dos_setftime* SetFileTime
    _dos_settime* SetSystemTime
    _dos_setvect* none
    _dos_write* WriteFile
    _dosexterr* GetLastError
    _enable* none
    _FP_OFF* none
    _FP_SEG* none
    _harderr* See Note 1
    _hardresume* See Note 1
    _hardretn* See Note 1
    _int86* none
    _int86x* none
    _intdos* none
    _intdosx* none
    _segread* none

    Time
    ----
    asctime See Note 2
    clock See Note 2
    ctime See Note 2
    difftime See Note 2
    _ftime See Note 2
    _getsystime GetLocalTime
    gmtime See Note 2
    localtime See Note 2
    mktime See Note 2
    _strdate See Note 2
    _strtime See Note 2
    time See Note 2
    _tzset See Note 2
    _utime SetFileTime

    Virtual Memory Allocation
    -------------------------
    _vfree* See Note 3
    _vheapinit* See Note 3
    _vheapterm* See Note 3
    _vload* See Note 3
    _vlock* See Note 3
    _vlockcnt* See Note 3
    _vmalloc* See Note 3
    _vmsize* See Note 3
    _vrealloc* See Note 3
    _vunlock* See Note 3

    32-Bit C Run Time
    ------------------
    _beginthread CreateThread
    _cwait WaitForSingleObject w/ GetExitCodeProcess
    _endthread ExitThread
    _findclose FindClose
    _findfirst FindFirstFile
    _findnext FindNextFile
    _futime SetFileTime
    _get_osfhandle none
    _open_osfhandle none
    _pclose See Note 4
    _pipe CreatePipe
    _popen See Note 4


    NOTE 1: The _harderr functions do not exist in the Win32 API. However, much
    of their functionality is available through structured exception handling.

    NOTE 2: The time functions are based on a format that is not used in Win32.
    There are specific Win32 time functions that are documented in the Help
    file.

    NOTE 3: The virtual memory functions listed in this document are specific
    to the MS-DOS environment and were written to access memory beyond the 640K
    of RAM available in MS-DOS. Because this limitation does not exist in
    Win32, the standard memory allocation functions should be used.

    NOTE 4: While _pclose() and _popen() do not have direct Win32 equivalents,
    you can (with some work) simulate them with the following calls:

    _popen CreatePipe, CreateProcess
    _pclose WaitForSingleObject, CloseHandle

    NOTE 5: GetFileInformationByHandle() is the Win32 equivalent for the
    _fstat() C Run-time function. However, GetFileInformationByHandle() is not
    supported by Win32s version 1.1. It is supported in Win32s 1.2.
    GetFileSize(), GetFileAttributes(), GetFileTime(), and GetFileTitle() are
    supported by Win32s 1.1 and 1.2.
