Description
ABSTRACT
Data analysts must make sense of increasingly large data sets, sometimes with billions or more records. We present
methods for interactive visualization of big data, following the principle that perceptual and interactive scalability
should be limited by the chosen resolution of the visualized data, not the number of records. We first describe
a design space of scalable visual summaries that use data reduction methods (such as binned aggregation or
sampling) to visualize a variety of data types. We then contribute methods for interactive querying (e.g., brushing
& linking) among binned plots through a combination of multivariate data tiles and parallel query processing. We
implement our techniques in imMens, a browser-based visual analysis system that uses WebGL for data processing
and rendering on the GPU. In benchmarks, imMens sustains 50 frames-per-second brushing & linking among
dozens of visualizations, with invariant performance on data sizes ranging from thousands to billions of records.
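As a rough illustration of the binned aggregation mentioned above, the following Python sketch counts 2D points on a fixed grid. It is not the imMens implementation (which runs in WebGL on the GPU); the function names `bin_counts_2d` and `summary_size`, the grid size, and the synthetic uniform data are all assumptions made for this example. The point it shows is that the size of the summary depends on the chosen bin resolution, not on the number of records.

```python
import random


def bin_counts_2d(points, x_range, y_range, x_bins, y_bins):
    """Count points per cell of a y_bins-by-x_bins grid (hypothetical helper)."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    counts = [[0] * x_bins for _ in range(y_bins)]
    for x, y in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # ignore points outside the binned extent
        # Clamp so a value equal to the upper bound lands in the last bin.
        i = min(int((x - x_min) / (x_max - x_min) * x_bins), x_bins - 1)
        j = min(int((y - y_min) / (y_max - y_min) * y_bins), y_bins - 1)
        counts[j][i] += 1
    return counts


def summary_size(counts):
    """Number of cells in the binned summary."""
    return sum(len(row) for row in counts)


# Synthetic data (an assumption for illustration): uniform points in the unit square.
rng = random.Random(0)
small = [(rng.random(), rng.random()) for _ in range(1_000)]
large = [(rng.random(), rng.random()) for _ in range(100_000)]

small_counts = bin_counts_2d(small, (0.0, 1.0), (0.0, 1.0), x_bins=20, y_bins=10)
large_counts = bin_counts_2d(large, (0.0, 1.0), (0.0, 1.0), x_bins=20, y_bins=10)

# Both summaries have the same number of cells, although the inputs differ 100x in size.
print(summary_size(small_counts), summary_size(large_counts))
```

A plot drawn from `small_counts` or `large_counts` has the same number of marks to render, which is the sense in which scalability is bounded by resolution rather than by record count.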
INTRODUCTION
Traditional data visualization tools are often inadequate to handle big data. While it is debatable what is meant by “big”, visualization researchers have regularly used one million or more data cases as a threshold [FP02, UTH06]. More generally, many data sets are too large to fit in memory and may be distributed across a cluster; modern data warehouses often include tables with billions or more records. Most visual analysis tools are not designed to work at this scale, let alone support real-time interaction [KPHH12]. Research on big data visualization must address two major challenges: perceptual and interactive scalability. Given the resolution of conventional displays (~1-3 million pixels), visualizing every data point can lead to over-plotting and may overwhelm users’ perceptual and cognitive capacities. On the other hand, reducing the data through sampling or filtering can elide interesting structures or outliers. Big data also impose challenges for interactive exploration. Querying large data stores can incur high latency, disrupting fluent interaction. Even with data reduction methods like binned aggregation, high dimensionality or fine-grained bins can result in data cubes too large to process in real time.
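To make the last point concrete, the short Python sketch below estimates the size of a dense data cube as the product of the bin counts along each dimension. The helper names `cube_cells` and `cube_bytes`, the 50-bin resolution, and the 4-byte cell size are assumptions chosen for illustration, not figures from the paper.

```python
from math import prod


def cube_cells(bins_per_dimension):
    """Number of cells in a dense cube: the product of the per-dimension bin counts."""
    return prod(bins_per_dimension)


def cube_bytes(bins_per_dimension, bytes_per_cell=4):
    """Approximate storage, assuming one fixed-size counter per cell (an assumption)."""
    return cube_cells(bins_per_dimension) * bytes_per_cell


three_dims = [50, 50, 50]          # 3 dimensions, 50 bins each
five_dims = [50, 50, 50, 50, 50]   # 5 dimensions, 50 bins each

print(cube_cells(three_dims), cube_bytes(three_dims))  # 125000 cells, 500000 bytes
print(cube_cells(five_dims), cube_bytes(five_dims))    # 312500000 cells, 1250000000 bytes
```

Under these assumptions, adding two dimensions multiplies the cube size by 2,500, which illustrates why a full cube over many dimensions quickly becomes impractical to process interactively.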
Year : 2013
Publisher : Blackwell Publishing
By : Zhicheng Liu, Biye Jiang, and Jeffrey Heer
File Information : English / 10 pages / Size : 3 MB