hex2ucp(), int2ucp(), name2ucp(), and str2ucp() return Unicode code points as character vectors. block2ucp() and range2ucp() return the code points in a Unicode block or range. is_ucp() returns TRUE if x is a valid Unicode code point.

Usage

hex2ucp(x)

int2ucp(x)

str2ucp(x)

name2ucp(x, type = c("exact", "grep"), ...)

is_ucp(x)

block2ucp(x, omit_unnamed = TRUE)

range2ucp(x, omit_unnamed = TRUE)

Arguments

x

R objects coercible to the respective Unicode character data types. See Unicode::as.u_char() for hex2ucp() and int2ucp(), base::utf8ToInt() for str2ucp(), Unicode::u_char_from_name() for name2ucp(), Unicode::as.u_char_range() for range2ucp(), and Unicode::u_blocks() for block2ucp().

type

one of "exact" or "grep", or an abbreviation thereof.

...

Arguments passed to grepl() when type = "grep" is used for pattern matching.

omit_unnamed

If TRUE, omit control codes and unassigned code points.

Value

A character vector of Unicode code points. is_ucp() instead returns a logical value.

Details

hex2ucp(x) is a wrapper for as.character(Unicode::as.u_char(toupper(x))). int2ucp(x) is a wrapper for as.character(Unicode::as.u_char(as.integer(x))). str2ucp(x) is a wrapper for as.character(Unicode::as.u_char(utf8ToInt(x))). name2ucp(x) is a wrapper for as.character(Unicode::u_char_from_name(x)). However, missing values are coerced to NA_character_ instead of "<NA>". Note that the names of bm_font() objects must be character vectors as returned by these functions, not Unicode::u_char objects.
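
To make the "U+XXXX" format and the missing-value behavior concrete, here is a minimal base R sketch. It is not the package's implementation: int2ucp_sketch() is a hypothetical name, and it only assumes that a code point is written as "U+" followed by at least four uppercase hexadecimal digits, with missing input kept as NA_character_.

```r
# Hypothetical helper (not part of the package): formats integer code
# points as "U+XXXX" character strings using base R only.
int2ucp_sketch <- function(x) {
  x <- as.integer(x)
  # %04X pads to at least four uppercase hexadecimal digits
  out <- sprintf("U+%04X", x)
  # Keep missing values as NA_character_ rather than a formatted string
  out[is.na(x)] <- NA_character_
  out
}

int2ucp_sketch(82)              # "U+0052"
int2ucp_sketch(utf8ToInt("R"))  # "U+0052"
int2ucp_sketch(NA)              # NA
```

The real functions delegate to the Unicode package, so they may validate or format input differently from this sketch.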

Examples

  # These are all different ways to get the same 'R' code point
  hex2ucp("52")
#> [1] "U+0052"
  hex2ucp(as.hexmode("52"))
#> [1] "U+0052"
  hex2ucp("0052")
#> [1] "U+0052"
  hex2ucp("U+0052")
#> [1] "U+0052"
  hex2ucp("0x0052")
#> [1] "U+0052"
  int2ucp(82) # 82 == as.hexmode("52")
#> [1] "U+0052"
  int2ucp("82") # 82 == as.hexmode("52")
#> [1] "U+0052"
  int2ucp(utf8ToInt("R"))
#> [1] "U+0052"
  ucp2label("U+0052")
#> [1] "LATIN CAPITAL LETTER R"
  name2ucp("LATIN CAPITAL LETTER R")
#> [1] "U+0052"
  str2ucp("R")
#> [1] "U+0052"

  block2ucp("Basic Latin")
#>  [1] "U+0020" "U+0021" "U+0022" "U+0023" "U+0024" "U+0025" "U+0026" "U+0027"
#>  [9] "U+0028" "U+0029" "U+002A" "U+002B" "U+002C" "U+002D" "U+002E" "U+002F"
#> [17] "U+0030" "U+0031" "U+0032" "U+0033" "U+0034" "U+0035" "U+0036" "U+0037"
#> [25] "U+0038" "U+0039" "U+003A" "U+003B" "U+003C" "U+003D" "U+003E" "U+003F"
#> [33] "U+0040" "U+0041" "U+0042" "U+0043" "U+0044" "U+0045" "U+0046" "U+0047"
#> [41] "U+0048" "U+0049" "U+004A" "U+004B" "U+004C" "U+004D" "U+004E" "U+004F"
#> [49] "U+0050" "U+0051" "U+0052" "U+0053" "U+0054" "U+0055" "U+0056" "U+0057"
#> [57] "U+0058" "U+0059" "U+005A" "U+005B" "U+005C" "U+005D" "U+005E" "U+005F"
#> [65] "U+0060" "U+0061" "U+0062" "U+0063" "U+0064" "U+0065" "U+0066" "U+0067"
#> [73] "U+0068" "U+0069" "U+006A" "U+006B" "U+006C" "U+006D" "U+006E" "U+006F"
#> [81] "U+0070" "U+0071" "U+0072" "U+0073" "U+0074" "U+0075" "U+0076" "U+0077"
#> [89] "U+0078" "U+0079" "U+007A" "U+007B" "U+007C" "U+007D" "U+007E"
  block2ucp("Basic Latin", omit_unnamed = FALSE)
#>   [1] "U+0000" "U+0001" "U+0002" "U+0003" "U+0004" "U+0005" "U+0006" "U+0007"
#>   [9] "U+0008" "U+0009" "U+000A" "U+000B" "U+000C" "U+000D" "U+000E" "U+000F"
#>  [17] "U+0010" "U+0011" "U+0012" "U+0013" "U+0014" "U+0015" "U+0016" "U+0017"
#>  [25] "U+0018" "U+0019" "U+001A" "U+001B" "U+001C" "U+001D" "U+001E" "U+001F"
#>  [33] "U+0020" "U+0021" "U+0022" "U+0023" "U+0024" "U+0025" "U+0026" "U+0027"
#>  [41] "U+0028" "U+0029" "U+002A" "U+002B" "U+002C" "U+002D" "U+002E" "U+002F"
#>  [49] "U+0030" "U+0031" "U+0032" "U+0033" "U+0034" "U+0035" "U+0036" "U+0037"
#>  [57] "U+0038" "U+0039" "U+003A" "U+003B" "U+003C" "U+003D" "U+003E" "U+003F"
#>  [65] "U+0040" "U+0041" "U+0042" "U+0043" "U+0044" "U+0045" "U+0046" "U+0047"
#>  [73] "U+0048" "U+0049" "U+004A" "U+004B" "U+004C" "U+004D" "U+004E" "U+004F"
#>  [81] "U+0050" "U+0051" "U+0052" "U+0053" "U+0054" "U+0055" "U+0056" "U+0057"
#>  [89] "U+0058" "U+0059" "U+005A" "U+005B" "U+005C" "U+005D" "U+005E" "U+005F"
#>  [97] "U+0060" "U+0061" "U+0062" "U+0063" "U+0064" "U+0065" "U+0066" "U+0067"
#> [105] "U+0068" "U+0069" "U+006A" "U+006B" "U+006C" "U+006D" "U+006E" "U+006F"
#> [113] "U+0070" "U+0071" "U+0072" "U+0073" "U+0074" "U+0075" "U+0076" "U+0077"
#> [121] "U+0078" "U+0079" "U+007A" "U+007B" "U+007C" "U+007D" "U+007E" "U+007F"
  range2ucp("U+0020..U+0030")
#>  [1] "U+0020" "U+0021" "U+0022" "U+0023" "U+0024" "U+0025" "U+0026" "U+0027"
#>  [9] "U+0028" "U+0029" "U+002A" "U+002B" "U+002C" "U+002D" "U+002E" "U+002F"
#> [17] "U+0030"
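
For readers curious how a range string such as "U+0020..U+0030" maps to individual code points, the following base R sketch expands one range by hand. range2ucp_sketch() is a hypothetical name. It assumes a single "start..end" string and ignores omit_unnamed, so it illustrates the idea rather than reproducing range2ucp().

```r
# Hypothetical helper (not part of the package): expands one
# "U+XXXX..U+YYYY" range string into individual code points.
range2ucp_sketch <- function(x) {
  # Split the range into its two endpoints
  ends <- strsplit(x, "..", fixed = TRUE)[[1]]
  # Drop the "U+" prefix and parse the hexadecimal digits
  ints <- strtoi(sub("^U\\+", "", ends), base = 16L)
  # Format every integer in the inclusive range as "U+XXXX"
  sprintf("U+%04X", seq(ints[1], ints[2]))
}

r <- range2ucp_sketch("U+0020..U+0030")
length(r)  # 17
r[1]       # "U+0020"
r[17]      # "U+0030"
```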