Stream Fusion on Haskell Unicode Strings

Author

  • Thomas Harper
Abstract

Prior papers have presented a fusion framework called stream fusion for removing intermediate data structures from both lists and arrays in Haskell. Stream fusion is unique in using an explicit datatype to accomplish fusion. We demonstrate how this can be exploited in the creation of a new Haskell string representation, Text, which achieves better performance and data density than String. Text uses streams not only to accomplish fusion, but also as a way to abstract away from various underlying representations. This allows the same set of combinators to manipulate Unicode text that is stored in a variety of ways.
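
As a rough illustration of what "an explicit datatype" means here, the following minimal sketch follows the shape commonly used in the stream fusion literature: a stream is a step function plus a hidden state. It is not the Text library's actual code. The list-based `stream` and `unstream` are stand-ins for conversions to and from a real underlying representation, and `mapS`, `filterS`, and `shout` are hypothetical names chosen for this example.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.Char (toUpper)

-- One step of a stream: finished, advance without an element, or yield one.
data Step s a = Done | Skip s | Yield a s

-- A stream is a step function plus an initial state; the state type is hidden.
data Stream a = forall s. Stream (s -> Step s a) s

-- Assumption: a list stands in for the underlying storage. A real string
-- type would read from its own representation here instead.
stream :: [a] -> Stream a
stream xs0 = Stream next xs0
  where
    next []     = Done
    next (x:xs) = Yield x xs

-- Run a stream to completion and rebuild the (stand-in) representation.
unstream :: Stream a -> [a]
unstream (Stream next s0) = go s0
  where
    go s = case next s of
      Done       -> []
      Skip s'    -> go s'
      Yield x s' -> x : go s'

-- Combinators are non-recursive wrappers around the step function,
-- so they build no intermediate structure of their own.
mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Stream next s0) = Stream next' s0
  where
    next' s = case next s of
      Done       -> Done
      Skip s'    -> Skip s'
      Yield x s' -> Yield (f x) s'

filterS :: (a -> Bool) -> Stream a -> Stream a
filterS p (Stream next s0) = Stream next' s0
  where
    next' s = case next s of
      Done       -> Done
      Skip s'    -> Skip s'
      Yield x s'
        | p x       -> Yield x s'
        | otherwise -> Skip s'   -- drop the element but keep going

-- Hypothetical pipeline: drop spaces, then upper-case, in a single traversal.
shout :: String -> String
shout = unstream . mapS toUpper . filterS (/= ' ') . stream
```

Because the combinators only see `Stream`, the same `mapS` and `filterS` would work unchanged over any storage format for which a `stream`/`unstream` pair can be written, which is the kind of abstraction the abstract describes. In fusion libraries a compiler rewrite rule typically removes adjacent `stream`/`unstream` pairs; that rule is omitted from this sketch.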

Similar resources

Haskell Beats C Using Generalized Stream Fusion

Stream fusion [6] is a powerful technique for automatically transforming high-level sequence-processing functions into efficient implementations. It has been used to great effect in Haskell libraries for manipulating byte arrays, Unicode text, and unboxed vectors. However, some operations, like vector append, still do not perform well within the standard stream fusion framework. Others, like SI...

Implementations of Bidirectional Reordering Algorithms L2/01-218

The goal of this paper is to contribute to a deeper understanding of the Unicode Bidirectional Reference Algorithm. We have provided an alternative reference algorithm written in the functional language Haskell. The advantage of Haskell is that it allows for a short, clear description of a complex problem. We have run our algorithm, the two Unicode reference implementations, and four others (IC...

Stream Fusion

Stream Fusion [1] is a system for removing intermediate list structures from Haskell programs; it consists of a Haskell library along with several compiler rewrite rules. (The library is available online at http://www.cse.unsw.edu.au/~dons/streams.html.) These theories contain a formalization of much of the Stream Fusion library in HOLCF. Lazy list and stream types are defined, along with coerc...

Rewriting Haskell Strings

The Haskell String type is notoriously inefficient. We introduce a new data type, ByteString, based on lazy lists of byte arrays, combining the speed benefits of strict arrays with lazy evaluation. Equational transformations based on term rewriting are used to deforest intermediate ByteStrings automatically. We describe novel fusion combinators with improved expressiveness and performance over ...

Exposing Homograph Obfuscation Intentions by Coloring Unicode Strings

Unicode has become a useful tool for information internationalization, particularly for applications in web links, web pages, and emails. However, many Unicode glyphs look so similar that malicious guys may utilize this feature to trick people’s eyes. In this paper, we propose to use Unicode string coloring as a promising countermeasure to this emerging threat. A coloring algorithm is designed ...

Publication date: 2009