Wednesday 23 August 2017

NeuroBayes systematic short-term trading


BIG DATA IN LOGISTICS
A DHL perspective on how to move beyond the hype
December 2013
Powered by Solutions & Innovation: Trend Research

PUBLISHER
DHL Customer Solutions & Innovation
Represented by Martin Wegner, Vice President Solutions & Innovation
53844 Troisdorf, Germany

PROJECT DIRECTOR
Dr. Markus Kückelhaus, Solutions & Innovation, DHL

PROJECT MANAGEMENT AND EDITORIAL OFFICE
Katrin Zeiler, Solutions & Innovation, DHL

AUTHORS
Martin Jeske, Moritz Grüner, Frank Weiß

PREFACE

Big Data and logistics are made for each other, and today the logistics industry is positioning itself to put this wealth of information to better use. The potential of Big Data in the logistics industry has already been highlighted in the acclaimed DHL Logistics Trend Radar. This overarching study is a living, dynamic document designed to help organizations derive new strategies and develop more powerful projects and innovations.

Big Data has much to offer the world of logistics. Sophisticated data analytics can consolidate this traditionally fragmented sector, and these new capabilities put logistics providers in pole position as search engines in the physical world.

To sharpen the focus, the trend report you are reading now asks the key Big Data questions:
- How can we move from a deep well of data to deep exploitation?
- How can we use information to improve operational efficiency and customer experience, and create useful new business models?

Big Data is a relatively untapped asset that companies can exploit once they adopt a shift of mindset and apply the right drilling techniques.

This report has been developed jointly with T-Systems and the experts from Detecon Consulting. The research team has combined world-class experience from both the logistics domain and the information management domain. It also goes far beyond the buzzwords to offer real-world use cases, revealing what is happening now and what is likely to happen in the future. This trend report starts with an introduction to the concept and meaning of Big Data, provides examples drawn from many different industries, and then presents logistics use cases.

We hope that Big Data in Logistics provides you with some powerful new perspectives and ideas. Thank you for choosing to join us on this Big Data journey; together we can all benefit from a new model of cooperation and collaboration in the logistics industry.

Yours sincerely,
Martin Wegner
Dr. Markus Kückelhaus

Table of Contents
Preface
1 Understanding Big Data
2 Big Data Best Practice Across Industries
2.1 Operational Efficiency
2.2 Customer Experience
2.3 New Business Models
3 Big Data in Logistics
3.1 Logistics as a Data-driven Business
3.2 Use Cases: Operational Efficiency
3.3 Use Cases: Customer Experience
3.4 Use Cases: New Business Models
3.5 Success Factors for Implementing Big Data Analytics
Outlook

1 UNDERSTANDING BIG DATA

The sustained success of Internet powerhouses such as Amazon, Google, Facebook, and eBay provides evidence of a fourth production factor in today's hyperconnected world. Besides resources, labor, and capital, there is no doubt that information has become an essential element of competitive differentiation. Companies in every sector are making efforts to trade gut feeling for accurate, data-driven insight in order to achieve effective business decision making. No matter the issue to be decided (anticipated sales volumes, customer product preferences, optimized work schedules), it is data that now has the power to help businesses succeed. Like the quest for oil, Big Data takes educated drilling to reveal a well of valuable information.

Why is the search for meaningful information so complex? It is because of the enormous growth of available data inside companies and on the public Internet. Back in 2008, the number of available pieces of digital information (bits) surpassed the number of stars in the universe. [1] Thanks to the growth of social media, ubiquitous network access, and the steadily increasing number of smart connected devices, today's digital universe is expanding at a rate that doubles the data volume every two years [2] (see Figure 1).

In addition to this exponential growth, two further characteristics of data have changed substantially. Firstly, data is pouring in. The massive deployment of connected devices such as cars, smartphones, RFID readers, webcams, and sensor networks adds a huge number of autonomous data sources. Devices such as these continuously generate data streams without human intervention, increasing the velocity of data aggregation and processing. Secondly, data is extremely varied. The vast majority of newly created data stems from camera images, video and surveillance footage, blog entries, forum discussions, and e-commerce catalogs. All of these unstructured data sources contribute to a much higher variety of data types.
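To make the quoted growth rate concrete, the following is a minimal sketch of what "doubling every two years" implies arithmetically. The function names and the base volume are illustrative assumptions; they are not taken from the IDC study.

```python
def projected_volume(base_volume: float, years_elapsed: float,
                     doubling_period_years: float = 2.0) -> float:
    """Volume after `years_elapsed` years if it doubles every `doubling_period_years`."""
    return base_volume * 2 ** (years_elapsed / doubling_period_years)


def annual_growth_rate(doubling_period_years: float = 2.0) -> float:
    """Constant yearly growth rate that is equivalent to the given doubling period."""
    return 2 ** (1 / doubling_period_years) - 1


# Hypothetical base volume of 1 unit: ten years of doubling every two years
# means five doublings, and the equivalent yearly growth is roughly 41%.
print(projected_volume(1.0, 10))
print(round(annual_growth_rate() * 100, 1))
```

Under this assumption, any starting volume grows by a factor of 32 per decade, which is the kind of curve the growth chart in this section depicts.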
Figure 1: Exponential data growth between 2010 and 2020 (data volume in exabytes by year, 2009 to 2020). Source: IDC's Digital Universe Study, sponsored by EMC, December 2012

[1] The Diverse and Exploding Digital Universe, IDC, 2008
[2] The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East, IDC, sponsored by EMC, December 2012

Volume, velocity, and variety (3Vs): is this Big Data?

In the literature, the 3Vs have been widely discussed as the characteristics of Big Data analytics. But there is far more to consider if businesses wish to leverage information as a production factor and strengthen their competitive position. What is required is a shift in mindset and the application of the right drilling techniques.

Becoming an Information-driven Business

When global telecommunications provider Telefonica started to explore information-driven business models, the company was already capable of processing hundreds of millions of data records from its mobile network each day in order to route and bill phone calls and data services. Thus, handling a huge data volume at high speed was not the main issue. Instead, the key question Telefonica had to answer on its way to eventually launching its Smart Steps service was: What additional value does the existing bulk of data carry, and how can we capitalize on it?

While consumers are used to making informed everyday decisions such as purchases, route planning, or finding a place to eat, companies are lagging behind. To exploit their information assets, companies have to change their attitude about how to use data. In the past, data analytics was used to confirm decisions that had already been taken. What is required is a cultural change. Companies must move toward a forward-looking style of data analysis that generates new insight and better answers. This shift in mindset also implies a new quality of experimentation, cooperation, and transparency across the company.

Along with this transition, another prerequisite for becoming an information-driven business is to establish a specific set of data science skills. This includes mastering a wide spectrum of analytical procedures and having a comprehensive understanding of the business. Companies must also take new technological approaches to explore information at a higher order of detail and speed. Disruptive paradigms of data processing such as in-memory databases and eventually consistent computing models promise to solve large-scale data analytics problems at an economically feasible cost.

Every company already owns a lot of information. But most of its data must be refined; only then can it be transformed into business value. With Big Data analytics, companies can achieve the attitude, skillset, and technology required to become a data refinery and create additional value from their information assets.

Logistics and Big Data are a Perfect Match

The logistics sector is ideally placed to benefit from the technological and methodological advancements of Big Data. A strong hint that data mastery has always been key to the discipline is that, in its ancient Greek roots, logistics means practical arithmetic. [3] Today, logistics providers manage a massive flow of goods and at the same time create vast data sets. For millions of shipments every day, origin and destination, size, weight, content, and location are all tracked across global delivery networks.
But is the value of this tracking data fully exploited? Probably not. Most likely there is huge untapped potential for improving operational efficiency and customer experience, and for creating useful new business models. Consider, for example, the benefits of integrating supply chain data streams from multiple logistics providers: this could eliminate current market fragmentation, enabling powerful new collaboration and services.

Many providers realize that Big Data is an exciting trend for the logistics industry. In a recent study on supply chain trends, sixty percent of respondents stated that they plan to invest in Big Data analytics within the next five years [4] (see Figure 2). However, the quest for competitive advantage starts with the identification of strong Big Data use cases. In this paper, we first look at organizations that have successfully deployed Big Data analytics in the context of their own industries. Then we present a number of use cases specific to the logistics sector.

Figure 2: Current and planned investment areas for Big Data technologies (bar chart, scale 0 to 70, series "Today" and "Five Years"; recoverable categories: Social Networks (Internal, B2B), Business Analytics Platforms as a Service, Network Redesign Software/Systems, Product Lifecycle Management). Source: Trends and Strategies in Logistics and Supply Chain Management, p. 51, BVL International, 2013

[3] Definition and development, Logistik Baden-Württemberg, cf. Logistik-bw. deDefinition.411M52087573ab0.0.html
[4] Trends and Strategies in Supply Chain Management and Logistics, BVL International, 2013

2 BIG DATA BEST PRACTICE ACROSS INDUSTRIES

Capitalizing on the value of information assets is a new strategic objective for most enterprises and organizations.
Apart from the Internet powerhouses that have successfully established information-driven business models, companies in other sectors are typically in the early stages of exploring how to benefit from their growing pile of data and put it to good use. According to recent research [5], only 14% of European companies already address Big Data analytics as part of their strategic planning (see Figure 3). And yet almost half of these companies expect yearly data growth in their organization of more than 25%.

Big Data Value Dimensions

When companies adopt Big Data as part of their business strategy, the first question to surface is usually what type of value Big Data will drive. Will it contribute to the top or bottom line, or will there be a non-financial driver? From a value point of view, the application of Big Data analytics falls into one of three dimensions (see Figure 4).

The first and most obvious is operational efficiency. In this case, data is used to make better decisions, to optimize resource consumption, and to improve process quality and performance. It is what automated data processing has always provided, but with an enhanced set of capabilities. The second dimension is customer experience; typical aims are to increase customer loyalty, perform precise customer segmentation, and optimize customer service. By including the vast data resources of the public Internet, Big Data propels CRM techniques to the next evolutionary stage. Finally, Big Data also enables new business models that complement revenue streams from existing products and create additional revenue from entirely new (data) products.

For each of these Big Data value dimensions, there is a growing number of compelling applications. These showcase the business potential of monetizing information across a wide spectrum of vertical markets. In the following sections, we present several use cases to illustrate how early movers have exploited data sources in innovative ways, and consequently created significant additional value.

Figure 3: Big Data as a strategic objective in European companies. Survey question: "Has your company defined a Big Data strategy?" No: 63%, Planned: 23%, Yes: 14%. Statistics from the BARC study (N = 273). Source: Big Data Survey Europe, BARC, February 2013, p. 17

[5] Big Data Survey Europe, BARC-Institute, February 2013

Figure 4: Value dimensions for Big Data use cases. Source: DPDHL / Detecon
Operational Efficiency. Use data to: increase the level of transparency; optimize resource consumption; improve process quality and performance.
Customer Experience. Exploit data to: increase customer loyalty and retention; perform precise customer segmentation and targeting; optimize customer interaction and service.
New Business Models. Capitalize on data by: expanding revenue streams from existing products; creating new revenue streams from entirely new (data) products.

2.1 Operational Efficiency

2.1.1 Utilizing data to predict crime hotspots

For metropolitan police departments, the task of tracking down criminals to preserve public safety can sometimes be tedious.
With many siloed information repositories, casework often involves manually connecting many data points. This takes time and dramatically slows case resolution. Moreover, road policing resources are deployed reactively, so it is very difficult to catch criminals in the act. In most cases, it is not possible to resolve these challenges by increasing police staffing, as government budgets are limited.

One authority that is leveraging its various data sources is the New York Police Department (NYPD). By capturing and connecting pieces of crime-related information, it hopes to stay one step ahead of the perpetrators of crime. [6] Long before the term Big Data was coined, the NYPD made an effort to break down the compartmentalization of its data ingests (e.g., data from 911 calls, investigation reports, and more). With a single view of all the information related to one particular crime, officers obtain a more coherent, real-time picture of their cases. This shift has significantly sped up retrospective analysis and allows the NYPD to take action earlier in tracking down individual criminals.

The steadily decreasing rates of violent crime in New York [7] have been attributed not only to this streamlining of the many data items required for casework but also to a fundamental change in policing practice. [8] By introducing statistical analysis and geographical mapping of crime spots, the NYPD has been able to create a bigger picture to guide resource deployment and patrol practice. The department can now recognize crime patterns using computational analysis, and this delivers insights that enable each commanding officer to proactively identify hot spots of criminal activity.

This anticipatory perspective puts the NYPD in a position to effectively target the deployment of manpower and resources. In combination with other measures, the systematic analysis of existing information has contributed to a steadily decreasing rate of violent crime (see Figure 5).

The technique of using historical data to perform pattern recognition, and thereby predict crime hotspots, has since been adopted by a number of municipalities in the United States. As more and more police departments offer crime record information to the public, third parties have also begun to provide crime spot predictions; they aggregate data into a national view and also provide an anonymous tip function (see Figure 6). [9]

Figure 5: Development of violent crime in New York City (robberies and murders per year, 2002 to 2012), taken from Index Crimes Reported to Police by Region: New York City, 2003-2012. Source: New York State Division of Criminal Justice Services, cf. Criminaljustice. ny. govcrimnetojsastats. htm

Figure 6: Screenshot of the Crimereports public engine, cf. Kriminal

[6] NYPD changes the crime control equation by the way it uses information, IBM, cf. Www-01.ibmsoftwaresuccesscssdb. nsfCSJSTS-6PFJAZ
[7] Index Crimes by Region, New York State Division of Criminal Justice Services, May 2013, cf. Criminaljustice. ny. govcrimnetojsastats. htm
[8] Compstat and Organizational Change in the Lowell Police Department, Willis et al., Police Foundation, 2004, cf. Policefoundation. org contentcompstat-and-organization-change-lowell-police-department
[9] Cf. Publicerines (exemplary)

2.1.2 Optimal shift planning in retail stores

For retail store managers, shift planning to meet customer demand is a sensitive task.
Overstaffing a store creates unnecessary costs and lowers the site's profitability. Running a store with too few staff has a negative impact on customer and employee satisfaction. Both are bad for business.

At DM drugstores, shift planning was historically performed by the store manager, based on simple extrapolation and personal experience. For regular business days, this process was good enough. But with an increasing number of exceptions, it became unsatisfactory. Overhead or shortfall of personnel limited store performance. So DM set out to support store managers in their forward personnel planning by finding a way to reliably forecast demand at each particular point of sale. [10]

The approach was to implement a long-term prediction of daily store revenues, taking into account a wide range of individual and local parameters. Input data to the new algorithm included historical revenue data, opening hours, and the arrival times of new goods from the distribution centers. On top of this, other data was ingested to achieve the highest level of precision. This data included local circumstances such as market days, holidays in neighboring locations, road diversions, and weather forecast data (as weather conditions have a significant impact on consumer behavior). DM evaluated different predictive algorithms, and the selected solution now provides projections so accurate that it has proved to be a powerful support for shift planning.

Based on the high-resolution prediction of daily sales for each individual store, employees can now enter their personal preferences into the shift schedule four to eight weeks in advance. Once approved, their shifts are unlikely to change; they can rely on the long-term plan, and last-minute changes are exceptional events.
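As a rough illustration of the idea described above (a per-store daily revenue forecast that is adjusted for local circumstances and then turned into a staffing level), the following is a minimal sketch. It is not DM's algorithm, which the report does not disclose: the weekday-average baseline, the multiplicative event factors, and the revenue-per-staff rule are all simplifying assumptions chosen for readability.

```python
import math
from collections import defaultdict
from datetime import date
from statistics import mean


def weekday_baseline(history):
    """Average historical revenue per weekday (0 = Monday ... 6 = Sunday)."""
    by_weekday = defaultdict(list)
    for day, revenue in history:
        by_weekday[day.weekday()].append(revenue)
    return {weekday: mean(values) for weekday, values in by_weekday.items()}


def forecast_revenue(day, baseline, local_factors=None):
    """Weekday baseline scaled by hypothetical local-event factors for that day."""
    factor = 1.0
    for event_factor in (local_factors or {}).get(day, []):
        factor *= event_factor
    return baseline[day.weekday()] * factor


def staff_needed(revenue, revenue_per_staff_day):
    """Illustrative rule: one person per `revenue_per_staff_day`, never fewer than one."""
    return max(1, math.ceil(revenue / revenue_per_staff_day))


# Made-up history for a single store.
history = [
    (date(2013, 1, 7), 1000.0),   # Monday
    (date(2013, 1, 14), 1200.0),  # Monday
    (date(2013, 1, 8), 800.0),    # Tuesday
]
baseline = weekday_baseline(history)
market_day = date(2013, 1, 21)    # a Monday with a local market (assumed +20%)
predicted = forecast_revenue(market_day, baseline, {market_day: [1.2]})
print(round(predicted, 2), staff_needed(predicted, 400.0))
```

A production system would replace the weekday average with a trained model and learn the event effects from data; the sketch only shows how local parameters and a staffing rule can be layered on top of a revenue forecast.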
This shows how applying predictive analytics at DM has increased in-store operational efficiency and, at the same time, contributed to a better work-life balance for store personnel.

[10] Business Intelligence Guide 2012/2013, isreport, isi Medien München, or cf. Blue-yonderendm-drogerie-markt-en. html

2.2 Customer Experience

2.2.1 Social influence analysis for customer retention

To gain insight into customer satisfaction and future demand, companies use a number of different business models. The conventional approach is to undertake market research on the customer base, but this creates a generalized view without a focus on individual consumer needs and behavior.

A problem that challenges telecommunications providers is customer churn (the loss of customers over a period of time). To help reduce churn, organizations typically analyze the usage patterns of individual subscribers and their own service quality. They also offer specific rewards [11] to keep some customers loyal, based on parameters such as customer spending, usage, and subscription length. In the past, these retention efforts based on individual customer value have achieved some improvement in loyalty [12], but customer churn remains an issue for providers (see Figure 7).

To better predict customer behavior, T-Mobile USA has started to include the social relations between subscribers in its churn management model. [13] The organization uses a multi-graph technique, similar to the methods used in social network analysis, to identify so-called tribe leaders. These are people who have a strong influence in larger, connected groups. If a tribe leader switches to a competitor's service, it is likely that a number of their friends and family members will also switch, like a domino effect. With this change in the way it calculates customer value, T-Mobile has enhanced its measurement to include not only a customer's lifetime spending on mobile services but also the size of his or her social network or tribe (see Figure 8).

Creating this entirely new perspective on its customers required T-Mobile to enrich its legacy data analysis (historically based on data from billing systems and communications network elements). In addition, roughly one petabyte of raw data, including information from web clickstreams and social networks, is now ingested to help track down the sophisticated mechanisms behind customer churn. This highly innovative approach has already paid off for T-Mobile. After just the first quarter of using its new churn management model, the organization's churn rate dropped by 50% compared with the same quarter in the previous year.

Figure 7: Development of customer churn rates (churn rate in percent, Q2 2005 to Q2 2008; series: Postpaid, Prepaid, and Blended, each with a trend line). From: Mobile Churn and Loyalty Strategies, Informa, p. 24

Figure 8: Identification of influencers within a mobile subscriber base

[11] Customer Loyalty Tracking, Informa, 2012
[12] Mobile Churn and Loyalty Strategies, 2nd Edition, Informa, 2009
[13] T-Mobile challenges churn with data, Brett Sheppard, O'Reilly Strata, 2011, cf. Str1.oreilly201108t-mobile-challenge-churn-with. html

2.2.2 Avoiding out-of-stock conditions for customer satisfaction

This is a frequent and disappointing experience for shoppers: once they have found the perfect item of clothing, they discover that the size they need is out of stock. With increasing competition in the textile and clothing segment, the availability of popular garments is now typically limited. This is caused by the consolidation of brands and accelerated product cycles. In some cases, there are only three weeks between the design of a garment and its first in-store arrival. [14] Frequent launches of new collections, driven by vertically organized chains, narrow the procurement of articles down to a single batch. This poses a risk to apparel chains, making it more important than ever to precisely anticipate consumer demand for a particular item. The ability to correctly predict demand has become a key factor for profitable business.

The multichannel retailer Otto Group realized that conventional methods of forecasting demand for online and mail-order catalog items were proving inadequate in an increasingly competitive environment. For 63% of items, the deviation (compared with actual sales volumes) exceeded approximately 20%. [15] The group appreciated the business risk of both overproduction and shortage. Overproduction would affect profitability and lock up too much capital. Shortage would annoy customers. To meet customer demand, and particularly the high expectations of digital natives when making an online purchase, the Otto Group took an innovative and disruptive approach to improving its ability to supply (see Figure 9).
Figure 9: Relative deviation of prognosis from actual sales volume (absolute frequency of items by relative deviation). Classical prediction: 63% of forecasts deviate by more than 20%; prediction with NeuroBayes: 11% of forecasts deviate by more than 20%. From: Big Data & Predictive Analytics: Der Nutzen von Daten für präzise Prognosen und Entscheidungen in der Zukunft, Otto Group, Michael Sinn, Conference Talk, Big Data Europe, Zurich, 28 August 2012

[14] Märkte in der globalen Modeindustrie, Patrik Aspers, Yearbook 2007/2008, Max Planck Institute for the Study of Societies
[15] Otto rechnet mit künstlicher Intelligenz, Lebensmittel Zeitung, 21 August 2009

Figure 10: Percentage of catalog items with actual sales figures deviating more than 20% from the demand forecast. Conventional demand forecast: 63%; demand forecast with predictive analytics: 11%. Source: Perfektes Bestandsmanagement durch Predictive Analytics, Mathias Stüben, Otto Group, at the 29th German Logistics Congress, October 2012

After evaluating a series of solutions for generating stable predictions of sales volumes, the Otto Group eventually succeeded by applying a method that originated in the field of high-energy physics. It uses a multivariate analysis tool that employs the self-learning capabilities of neural network techniques and combines them with Bayesian statistics. [16] With this analytics tool, the group established an entirely new forecasting engine: it trained the tool with historical data from 16 previous seasons, and it continuously feeds the tool with 300 million transaction records per week from the current season. This new system generates more than one billion individual forecasts per year, and has already delivered convincing results.
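The quality measure used in Figures 9 and 10 (the share of items whose forecast misses actual sales by more than 20%) can be sketched as follows. This only illustrates the metric, not the NeuroBayes forecasting method itself; the function names are hypothetical, and measuring the deviation relative to actual sales is an assumption based on the figure captions.

```python
def relative_deviation(forecast: float, actual: float) -> float:
    """Absolute forecast error as a fraction of actual sales (actual must be > 0)."""
    return abs(forecast - actual) / actual


def share_of_items_off_target(items, threshold: float = 0.20) -> float:
    """Fraction of (forecast, actual) pairs whose relative deviation exceeds `threshold`."""
    off_target = sum(
        1 for forecast, actual in items
        if relative_deviation(forecast, actual) > threshold
    )
    return off_target / len(items)


# Made-up (forecast, actual) sales volumes for four catalog items.
items = [(100, 100), (130, 100), (90, 100), (50, 100)]
print(share_of_items_off_target(items))
```

Applied to a full catalog, a value of 0.63 would correspond to the conventional forecast reported above, and 0.11 to the forecast produced with predictive analytics.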
With only 11% of catalog items missing their sales prediction by more than 20% (see Figure 10), the Otto Group is now better able to satisfy customer demand. [17] At the same time, this new predictive approach lowers stock holding, resulting in improved profitability and availability of funds.

[16] Cf. Neurobayes. phi-t. deindex. phppublic-information
[17] Treffsichere Absatzprognose mit Predictive Analytics, Michael Sinn, Conference Talk, Big Data & Analytics Kongress, Cologne, 19 June 2012, cf. YoutubewatchvhAE2Mui5lRA

2.3 New Business Models

2.3.1 Crowd analytics deliver retail and advertising insights

To provide effective mobile voice and data services, network operators must continuously capture a set of data on every subscriber. Apart from recording the usage of mobile services (for accounting and billing purposes), operators must also record each subscriber's location so that calls and data streams can be directed to the cell tower to which the subscriber's handset is connected. This is how every subscriber creates a digital trail as they move around the provider's network. And in most countries, it is just a small group of network operators that have captured most of the population as customers; the combined digital trails of their subscriber base provide a comprehensive reflection of society or, more precisely, of how society moves.

In urban areas, the density of digital trails is high enough to correlate the collective behavior of the subscriber crowd with the characteristics of a particular location or area. For example, it is possible to assess the attractiveness of a specific street for opening a new store, based on high-resolution analysis of how people move and rest in this area, and to find the opening hours likely to create maximum footfall (see Figure 11). In a larger context, it is also possible to see the impact of events such as marketing campaigns and the opening of a competitor's store by analyzing changes in movement patterns. When gender and age group splits are included in the data, along with geo-localized data sets and social network activity, this segmentation adds even greater value for retailers and advertisers.

In the past, organizations could only make internal use of location and usage data from mobile networks. This is because privacy laws limit the exploitation of individual subscriber information. But once subscriber identity has been split from the movement data, substantial business value remains in this anonymous crowd data, as Telefonica has discovered. With the launch of the Telefonica Digital global business division, the network operator is now driving business innovation outside its core business units and brands. As part of Telefonica Digital, the Dynamic Insights initiative has commercialized the analysis of movement data, creating incremental revenue from retail, property, leisure, and media customers. [18] Other carriers have developed similar offerings, such as Verizon's Precision Market Insights service. [19]

Figure 11: Analysis of customer footfall in a particular location based on mobile subscriber data, from blog. telefonicacress-releasetelefonica-dynamic-insight-launches-smart-steps-in-the-uk

[18] Cf. Dynamicinsights. telefonica
[19] Cf. Verizonenterpriseindustryretailprecision-market-insight

2.3.2 Creating a new insurance product from geo-localized data

Climate sensitivity is a characteristic of the agriculture industry, as local temperatures, sunshine hours, and precipitation levels directly affect crop yield.
With the rise of extreme weather conditions due to global warming, climate variation has become a substantial risk for farmers. [20] To mitigate the impact of crop shortfalls, farmers take out insurance policies to cover their potential financial losses. Insurance companies in turn are challenged by increasingly unpredictable local weather. On the one hand, conventional risk models based on historical data are no longer suitable for anticipating future insured losses. [21] On the other hand, claims have to be controlled more accurately, as damage may vary across an affected region. For farmers, the combination of these two aspects results in higher insurance rates and slower payout of damage claims.

In the United States, most private insurance companies viewed crop production as too risky to insure without federal subsidies. [22] In 2006, The Climate Corporation set out to create a new weather simulation model based on 2.5 million temperature and precipitation data points, combined with 150 million soil observations. The high resolution of its simulation grid allows the company to dynamically calculate the risk and pricing for weather insurance coverage on a per-field basis across the entire country (see Figure 12).

As the tracking of local growing conditions and the calculation of crop shortfall are performed in real time, payouts to policy holders are executed automatically when bad weather conditions occur. This eliminates the need for sophisticated and time-consuming claims processes. Based on 10 trillion simulation data points [23], The Climate Corporation's new insurance business model is now successfully established. After only six years, the organization's insurance services have been approved across all 50 states in the U.S.
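The automatic, per-field payout described above can be pictured as a simple rule that compares observed weather with a trigger level agreed in the policy. The sketch below is purely illustrative: the report does not describe The Climate Corporation's actual pricing or payout logic, and the `FieldPolicy` fields, the rainfall-only trigger, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FieldPolicy:
    field_id: str
    rainfall_trigger_mm: float  # payout starts when seasonal rainfall falls below this
    payout_per_mm: float        # amount paid per millimeter of rainfall shortfall
    max_payout: float           # cap on the total payout


def automatic_payout(policy: FieldPolicy, observed_rainfall_mm: float) -> float:
    """Payout owed for one field, computed directly from observed rainfall."""
    shortfall = max(0.0, policy.rainfall_trigger_mm - observed_rainfall_mm)
    return min(policy.max_payout, shortfall * policy.payout_per_mm)


policy = FieldPolicy("field-1", rainfall_trigger_mm=300.0,
                     payout_per_mm=2.0, max_payout=250.0)
for rainfall in (320.0, 250.0, 100.0):
    print(rainfall, automatic_payout(policy, rainfall))
```

Because the payout depends only on measured conditions and the policy terms, no individual damage assessment is needed, which is the property that removes the claims process in the business model described above.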
Figure 12: Real-time tracking of weather conditions and yield impact per field; screenshot taken from climateproductsclimate-apps

20 Managing the Risks of Extreme Events and Disasters to Advance Climate Change Adaptation, Chapter 4.3.4, Intergovernmental Panel on Climate Change (IPCC), 2012; cf. ipcc. chpdfspecial-reportssrexSREXFullReport. pdf
21 Warming of the Oceans and Implications for the (Re-)Insurance Industry, The Geneva Association, June 2013
22 Weather Insurance Reinvented, Linda H. Smith, DTN The Progressive Farmer, November 2011; cf. dtnprogressivefarmer
23 About us, The Climate Corporation; cf. climatecompanyabout

3 BIG DATA IN LOGISTICS

Companies are learning to turn large-scale quantities of data into competitive advantage. Their precise forecasting of market demand, radical customization of services, and entirely new business models demonstrate exploitation of their previously untapped data. As today's best practices touch many vertical markets, it is reasonable to predict that Big Data will also become a disruptive trend in the logistics industry. However, the application of Big Data analytics is not immediately obvious in this sector. The particularities of the logistics business must be thoroughly examined first in order to discover valuable use cases.

3.1 Logistics as a Data-driven Business

A kick-start for discussion of how to apply Big Data is to look at creating and consuming information. In the logistics industry, Big Data analytics can provide competitive advantage because of five distinct properties. These five properties highlight where Big Data can be most effectively applied in the logistics industry. They provide a roadmap to the well of unique information assets owned by every logistics provider. In the following sections, we identify specific use cases that exploit the value of this information, and contribute to operational efficiency, a better customer experience, or the development of new business models.
1. Optimization to the core
Optimization of service properties like delivery time, resource utilization, and geographical coverage is an inherent challenge of logistics. Large-scale logistics operations require data to run efficiently. The earlier this information is available and the more precise the information is, the better the optimization results will become. Advanced predictive techniques and real-time processing promise to provide a new quality in capacity forecast and resource control.

2. Tangible goods, tangible customers
The delivery of tangible goods requires a direct customer interaction at pickup and delivery. On a global scale, millions of customer touch points a day create an opportunity for market intelligence, product feedback, or even demographics. Big Data concepts provide versatile analytic means in order to generate valuable insight on consumer sentiment and product quality.

3. In sync with customer business
Modern logistics solutions seamlessly integrate into production and distribution processes in various industries. The tight level of integration with customer operations lets logistics providers feel the heartbeat of individual businesses, vertical markets, or regions. The application of analytic methodology to this comprehensive knowledge reveals supply chain risks and provides resilience against disruptions.

4. A network of information
The transport and delivery network is a high-resolution data source. Apart from using data for optimizing the network itself, network data may provide valuable insight on the global flow of goods. The power and diversity of Big Data analytics moves the level of observation to a microeconomic viewpoint.

5. Global coverage, local presence
Local presence and decentralized operations is a necessity for logistics services. A fleet of vehicles moving across the country can automatically collect local information along the transport routes. Processing this huge stream of data originating from a large delivery fleet creates a valuable zoom display for demographic, environmental, and traffic statistics.

Figure: The Data-driven Logistics Provider (2013 Detecon International). The diagram shows the flow of data and the flow of physical goods between the transport network, external online sources, and the existing and new customer base, and places eleven use cases in this picture:
1 Real-time Route Optimization: Delivery routes are dynamically calculated based on delivery sequence, traffic conditions, and recipient status.
2 Crowd-based Pickup and Delivery: A large crowd of occasionally available carriers pick up or deliver shipments along routes they would take anyway.
3 Strategic Network Planning: Long-term demand forecasts for transport capacity are generated in order to support strategic investments into the network.
4 Operational Capacity Planning: Short- and mid-term capacity planning allows optimal utilization and scaling of manpower and resources.
5 Customer Loyalty Management: Public customer information is mapped against business parameters in order to predict churn and initiate countermeasures.
6 Service Improvement and Product Innovation: A comprehensive view on customer requirements and service quality is used to enhance the product portfolio.
7 Risk Evaluation and Resilience Planning: By tracking and predicting events that lead to supply chain disruptions, the resilience level of transport services is increased.
8 Market Intelligence for SME: Supply chain monitoring data is used to create market intelligence reports for small and medium-sized companies.
9 Financial Demand and Supply Chain Analytics: A micro-economic view is created on global supply chain data that helps financial institutions improve their rating and investment decisions.
10 Address Verification: Fleet personnel verify recipient addresses, which are transmitted to a central address verification service provided to retailers and marketing agencies.
11 Environmental Intelligence: Sensors attached to delivery vehicles produce fine-meshed statistics on pollution, traffic density, noise, parking spot utilization, etc.

3.2 Use Cases: Operational Efficiency

A straightforward way to apply Big Data analytics in a business environment is to increase the level of efficiency in operations. This is simply what IT has always been doing (accelerating business processes), but Big Data analytics effectively opens the throttle.

3.2.1 Last-mile optimization

A constraint in achieving high operational efficiency in a distribution network occurs at the last mile.24 This final hop in a supply chain is often the most expensive one. The optimization of last-mile delivery to drive down product cost is therefore a promising application for Big Data techniques. Two fundamental approaches make data analysis a powerful tool for increasing last-mile efficiency. In a first and somewhat evolutionary step, a massive stream of information is processed to further maximize the performance of a conventional delivery fleet. This is mainly achieved by real-time optimization of delivery routes.
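As a rough illustration of what re-calculating a delivery sequence on the go can look like, the sketch below re-plans the remaining stops whenever the vehicle position or recipient availability changes. It is a minimal sketch under stated assumptions: it uses the greedy nearest-neighbour heuristic with straight-line distances, ignores traffic, and the function name `plan_route` and the stop coordinates are hypothetical. It is not the method used by any DHL system.

```python
import math


def plan_route(start, stops, unavailable=frozenset()):
    """Greedy nearest-neighbour ordering of the remaining stops.

    start       -- (x, y) current vehicle position
    stops       -- dict mapping stop name to (x, y) position
    unavailable -- names of recipients who reported they are not at home
    """
    remaining = {name: xy for name, xy in stops.items()
                 if name not in unavailable}
    route, position = [], start
    while remaining:
        # Pick the closest stop that has not been served yet.
        name = min(remaining, key=lambda n: math.dist(position, remaining[n]))
        route.append(name)
        position = remaining.pop(name)
    return route


if __name__ == "__main__":
    stops = {"A": (1, 0), "B": (5, 0), "C": (2, 0)}
    print(plan_route((0, 0), stops))                      # initial tour
    print(plan_route((0, 0), stops, unavailable={"C"}))   # recipient C is out
    print(plan_route((4, 0), stops))                      # vehicle has moved
```

Nearest-neighbour is only a heuristic and does not guarantee the shortest tour; a production system would combine a stronger combinatorial optimizer with live traffic data.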
The second, more disruptive approach utilizes data processing to control an entirely new last-mile delivery model. With this, the raw capacity of a huge crowd of randomly moving people replaces the effectiveness of a highly optimized workforce.

1 Real-time route optimization

The traveling salesman problem was formulated around eighty years ago, but still defines the core challenge for last-mile delivery. Route optimization on the last mile aims at saving time in the delivery process. Rapid processing of real-time information supports this goal in multiple ways. When the delivery vehicle is loaded and unloaded, a dynamic calculation of the optimal delivery sequence based on sensor-based detection of shipment items frees the staff from manual sequencing. On the road, telematics databases are tapped to automatically change delivery routes according to current traffic conditions. And routing intelligence considers the availability and location information posted by recipients in order to avoid unsuccessful delivery attempts.

In summary, every delivery vehicle receives a continuous adaptation of the delivery sequence that takes into account geographical factors, environmental factors, and recipient status. What makes this a Big Data problem? It requires the execution of combinatorial optimization procedures fed from correlated streams of real-time events to dynamically re-route vehicles on the go. As a result, each driver receives instant driving direction updates from the onboard navigation system, guiding them to the next best point of delivery.

DHL SmartTruck
Daily optimized initial tour planning based on incoming shipment data
Dynamic routing system, which recalculates the routes depending on the current order and traffic situation
Cuts costs and improves CO2 efficiency, for example by reducing mileage

24 The term last mile has its origin in telecommunications and describes the last segment in a communication network that actually reaches the customer.
In the logistics sector, the last mile is a metaphor for the final section of a supply chain, in which goods are handed over to the recipient. Source: The definition of the first and last miles, DHL Discover Logistics, cf. dhl-discoverlogisticscmsencoursetechnologies reinforcementfirst. jsp

2 Crowd-based pick-up and delivery

The wisdom and capacity of a crowd of people has become a strong lever for effectively solving business problems. Sourcing a workforce, funding a startup, or performing networked research are just a few examples of requisitioning resources from a crowd. Applied to a distribution network, a crowd-based approach may create substantial efficiency enhancements on the last mile. The idea is simple: Commuters, taxi drivers, or students can be paid to take over last-mile delivery on the routes that they are traveling anyway. Scaling up the number of these affiliates to a large crowd of occasional carriers effectively takes load off the delivery fleet.

Despite the fact that crowd-based delivery has to be incentivized, it has potential to cut last-mile delivery costs, especially in rural and sparsely populated areas. On the downside, a crowd-based approach also poses a vital challenge: The automated control of a huge number of randomly moving delivery resources. This requires extensive data processing capabilities, answered by Big Data techniques such as complex event processing and geocorrelation. A real-time data stream is traced in order to assign shipments to available carriers, based on their respective location and destination. Interfaced through a mobile application, crowd affiliates publish their current position and accept pre-selected delivery assignments.

The above two use cases illustrate approaches to optimizing last-mile delivery, yet they are diametrically opposed. In both cases, massive real-time information (originating from sensors, external databases, and mobile devices) is combined to operate delivery resources at maximum levels of efficiency. And both of these Big Data applications are enabled by the pervasiveness of mobile technologies.

DHL MyWays
Unique crowd-based delivery for B2C parcels
Flexible delivery in time and location
Using existing movement of city residents
myways

3.2.2 Predictive network and capacity planning

Optimal utilization of resources is a key competitive advantage for logistics providers. Excess capacities lower profitability (which is critical for low-margin forwarding services), while capacity shortages impact service quality and put customer satisfaction at risk. Logistics providers must therefore perform thorough resource planning, both at strategic and operational levels. Strategic-level planning considers the long-term configuration of the distribution network, and operational-level planning scales capacities up or down on a daily or monthly basis. For both perspectives, Big Data techniques improve the reliability of planning and the level of detail achieved, enabling logistics providers to perfectly match demand and available resources.

3 Strategic network planning

At a strategic level, the topology and capacity of the distribution network are adapted according to anticipated future demand. The results from this stage of planning usually drive investments with long requisition and amortization cycles such as investments in warehouses, distribution centers, and custom-built vehicles. More precise capacity demand forecasts therefore increase efficiency and lower the risks of investing in storage and fleet capacity. Big Data techniques support network planning and optimization by analyzing comprehensive historical capacity and utilization data of transit points and transportation routes.
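To make the idea of deriving capacity demand from historical utilization data concrete, the sketch below fits a straight-line trend to past volumes at one transit point and lists the future periods in which projected demand would exceed installed capacity. It is a deliberately simple sketch under stated assumptions: the function names, the figures, and the choice of an ordinary least-squares trend are illustrative and far simpler than the regression and scenario models a real planning system would use.

```python
def fit_trend(values):
    """Ordinary least-squares line through (0, v0), (1, v1), ...

    Returns (slope, intercept). Requires at least two observations.
    """
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_over_capacity(values, capacity, horizon):
    """Indices of future periods whose projected volume exceeds capacity."""
    slope, intercept = fit_trend(values)
    n = len(values)
    return [t for t in range(n, n + horizon)
            if intercept + slope * t > capacity]


if __name__ == "__main__":
    history = [100, 110, 120, 130]   # hypothetical volume per period
    print(fit_trend(history))
    print(periods_over_capacity(history, capacity=155, horizon=4))
```

A list of periods like this is the kind of early signal that would feed an investment decision on additional storage or fleet capacity.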
In addition, these techniques consider seasonal factors and emerging freight flow trends by means of learning algorithms that are fed with extensive statistical series. External economic information (such as industry-specific and regional growth forecasts) is included for more accurate prediction of specific transportation capacity demand. In summary, to substantially increase predictive value, a much higher volume and variety of information is exploited by advanced regression and scenario modeling techniques. The result is a new quality of planning with expanded forecast periods; this effectively reduces the risk of long-term infrastructure investments and contracted external capacities. It can also expose any impending over-capacity and provide this as automated feedback to accelerate sales volume. This is achieved by dynamic pricing mechanisms, or by transfer of overhead capacities to spot-market trading.

4 Operational capacity planning

At operational level, transit points and transportation routes must be managed efficiently on a day-to-day basis. This involves capacity planning for trucks, trains, and aircraft as well as shift planning for personnel in distribution centers and warehouses. Often operational planning tasks are based on historical averages or even on personal experience, which typically results in resource inefficiency. Instead, using the capabilities of advanced analytics, the dynamics within and outside the distribution network are modeled and the impact on capacity requirements calculated in advance. Real-time information about shipments (items that are entering the distribution network, are in transit, and are stored) is aggregated to predict the allocation of resources for the next 48 hours. This data is automatically sourced from warehouse management systems and sensor data along the transportation chain. In addition, the detection of ad-hoc changes in demand is derived from externally available customer information (e.g., data on product releases, factory openings, or unexpected bankruptcy). Additionally, local incidents are detected (e.g., regional disease outbreaks or natural disasters) as these can skew demand figures for a particular region or product.

This prediction of resource requirements helps operating personnel to scale capacity up or down in each particular location. But there's more to it than that. A precise forecast also reveals upcoming congestions on routes or at transit points that cannot be addressed by local scaling. For example, a freight aircraft that is working to capacity must leave behind any further expedited shipments at the airport of origin. Simulation results give early warning of this type of congestion, enabling shipments to be reassigned to uncongested routes, mitigating the local shortfall. This is an excellent example of how Big Data analytics can turn the distribution network into a self-optimizing infrastructure.

Both of the above Big Data scenarios increase resource efficiency in the distribution network, but the style of data processing is different. The strategic optimization combines a high data volume from a variety of sources in order to support investment and contracting decisions, while the operational optimization continuously forecasts network flows based on real-time streams of data.

DHL Parcel Volume Prediction
Analytic tool to measure influences of external factors on the expected volume of parcels
Correlates external data with internal network data
Results in a Big Data Prediction Model that significantly increases operational capacity planning
Ongoing research project by DHL Solutions & Innovation

3.3 Use Cases: Customer Experience

The aspect of Big Data analytics that currently attracts the most attention is acquisition of customer insight. For every business, it is vitally important to learn about customer demand and satisfaction.
But as organizations experience increased business success, the individual customer can blur into a large and anonymous customer base. Big Data analytics help to win back individual customer insight and to create targeted customer value.

3.3.1 Customer value management

Clearly, data from the distribution network carries significant value for the analysis and management of customer relations. With the application of Big Data techniques, and enriched by public Internet mining, this data can be used to minimize customer attrition and understand customer demand.

5 Customer loyalty management

For most business models, the cost of winning a new customer is far higher than the cost of retaining an existing customer. But it is increasingly difficult to trace and analyze individual customer satisfaction because there are more and more indirect customer touch points (e.g., portals, apps, and indirect sales channels). Because of this, many businesses are failing to establish effective customer retention programs. Smart use of data enables the identification of valuable customers who are on the point of leaving to join the competition. Big Data analytics allow a comprehensive assessment of customer satisfaction by merging multiple extensive data sources.

For logistics providers, this materializes in a combined evaluation of records from customer touch points, operational data on logistics service quality, and external data. How do these pieces fit together? Imagine the scenario of a logistics provider noticing a customer who lowers shipment volumes despite concurrently publishing steady sales records through newswire. The provider then checks delivery records, and realizes that this customer recently experienced delayed shipments. Looking at the bigger picture, this information suggests an urgent need for customer retention activity.
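The scenario above combines three signals: falling shipment volume, steady public sales figures, and recent delivery delays. A minimal sketch of such a rule follows; the `CustomerSnapshot` fields and every threshold value are hypothetical assumptions chosen for illustration, not figures from the report.

```python
from dataclasses import dataclass


@dataclass
class CustomerSnapshot:
    """Hypothetical merged view of one customer from three data sources."""
    name: str
    shipment_change: float   # relative change in shipped volume, e.g. -0.25
    sales_change: float      # relative change in publicly reported sales
    delayed_share: float     # share of recent shipments delivered late


def needs_retention(c: CustomerSnapshot,
                    volume_drop: float = -0.10,
                    sales_floor: float = -0.02,
                    delay_limit: float = 0.05) -> bool:
    """Flag customers who ship less although their sales hold steady
    and who recently experienced delayed deliveries."""
    return (c.shipment_change <= volume_drop
            and c.sales_change >= sales_floor
            and c.delayed_share >= delay_limit)


if __name__ == "__main__":
    customers = [
        CustomerSnapshot("steady sales, fewer shipments", -0.25, 0.01, 0.12),
        CustomerSnapshot("sales fell as well", -0.25, -0.30, 0.12),
        CustomerSnapshot("growing customer", 0.05, 0.00, 0.20),
    ]
    for c in customers:
        print(c.name, needs_retention(c))
```

The second customer is not flagged because its lower shipment volume is explained by lower sales rather than by a likely switch to a competitor.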
To achieve this insight not just with one customer but across the entire customer base, the logistics provider must tap multiple data sources and use Big Data analytics. Customer touch points include responses to sales and marketing activities, customer service inquiries, and complaint management details. This digital customer trail is correlated with data from the distribution network comprising statistical series on shipping volume and received service quality levels. In addition, the Internet provides useful customer insight: Publicly available information from news agencies, annual reports, stock trackers, or even sentiments from social media sites enrich the logistics provider's internal perspective of each customer. From this comprehensive information pool, the logistics provider can extract the attrition potential of every single customer by applying techniques such as semantic text analytics, natural-language processing, and pattern recognition. On automatically generated triggers, the provider then initiates proactive counter-measures and customer loyalty programs.

Although business relationships in logistics usually relate to the sender side, loyalty management must also target the recipient side. Recipients are even more affected by poor service quality, and their feedback influences sender selection for future shipments. A good example of this is Internet or catalog shopping: Recurring customer complaints result in the vendor considering a switch of logistics provider. But including recipients in loyalty management requires yet more data to be processed, especially in B2C markets. Big Data analytics are essential, helping to produce an integrated view of customer interactions and operational performance, and ensure sender and recipient satisfaction.
6 Continuous service improvement and product innovation

Logistics providers collect customer feedback as this provides valuable insight into service quality and customer expectations and demands. This feedback is a major source of information for continuous improvement in service quality. It is also important input for the ideation of new service innovations. To get solid results from customer feedback evaluation, it is necessary to aggregate information from as many touch points as possible. In the past, the only sources of this data were CRM systems and customer surveys. But today, Big Data solutions provide access to gargantuan volumes of useful data stored on public Internet sites. In social networks and on discussion forums, people openly and anonymously share their service experiences. But extracting relevant customer feedback by hand from the natural-language content created by billions of Internet users is like looking for the proverbial needle in a haystack.

Sophisticated Big Data techniques such as text mining and semantic analytics allow the automated retrieval of customer sentiment from huge text and audio repositories. In addition, this unsolicited feedback on quality and demand can be broken down by region and time. This enables identification of correlation with one-time incidents and tracking the effect of any initiated action. In summary, meticulous review of the entire public Internet brings unbiased customer feedback to the logistics provider. This empowers product and operational managers to design services capable of meeting customer demand.

3.3.2 Supply chain risk management

The uninterrupted direct supply of materials is essential to businesses operating global production chains. Lost, delayed, or damaged goods have an immediate negative impact on revenue streams. Whereas logistics providers are prepared to control their own operational risk in supply chain services, an increasing number of disruptions result from major events such as civil unrest, natural disasters, or sudden economic developments.25 To anticipate supply chain disruptions and mitigate the effect of unforeseen incidents, global enterprises seek to deploy business continuity management (BCM) measures.26

This demand for improved business continuity creates an opportunity for logistics providers to expand their customer value in outsourced supply chain operations. Rapid analysis of various information streams can be used to forecast events with a potentially significant or disastrous impact on customer business. In response to arising critical conditions, counter-measures can be initiated early to tackle arising business risks.

25 Are you ready for anything, DHL Supply Chain Matters, 2011; cf. dhlsupplychainmatters. dhlefficiencyarticle24are-you-ready-for-anything
26 Making the right risk decisions to strengthen operations performance, PriceWaterhouseCoopers and MIT Forum for Supply Chain Innovation, 2013

7 Risk evaluation and resilience planning

Contract logistics providers know their customers' supply chains in great detail. To cater for the customer need for predictive risk assessment, two things must be linked and continuously checked against each other: A model describing all elements of the supply chain topology, and monitoring of the forces that affect the performance of this supply chain. Data on local developments in politics, economy, nature, health, and more must be drawn from a plethora of sources (e.g., social media, blogs, weather forecasts, news sites, stock trackers, and many other publicly available sites), and then aggregated and analyzed. Most of this information stream is unstructured and continuously updated, so Big Data analytics power the retrieval of input that is meaningful in the detection of supply chain risks.
Both semantic analytics and complex event processing techniques are required to detect patterns in this stream of interrelated information pieces.27 The customer is notified when a pattern points to a critical condition arising for one of the supply chain elements (e.g., a tornado warning in the region where a transshipment point is located). This notification includes a report on the probability and impact of this risk, and provides suitable counter-measures to mitigate potential disruption. Equipped with this information, the customer can re-plan transport routes or ramp up supplies from other geographies.

Robust supply chains that are able to cope with unforeseen events are a vital business capability in today's rapidly changing world. In addition to a resilient and flexible supply chain infrastructure, businesses need highly accurate risk detection to keep running when disaster strikes. With Big Data tools and techniques, logistics providers can secure customer operations by performing predictive analytics on a global scale.

Coming Soon: A New Supply Chain Risk Management Solution by DHL
A unique consultancy and software solution that improves the resilience of your entire supply chain
Designed to reduce emergency costs, maintain service levels, protect sales, and enable fast post-disruption recovery
Protects your brand and market share, informs your inventory decisions, and creates competitive advantage

27 The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems, David C. Luckham, Addison-Wesley Longman, 2001

3.4 Use Cases: New Business Models

3.4.1 B2B demand and supply chain forecast

The logistics sector has long been a macroeconomic indicator, and the global transportation of goods often acts as a benchmark for future economic development. The type of goods and shipped volumes indicate regional demand and supply levels.
The predictive value of logistics data for the global economy is evidenced by existing financial indices measuring the macroeconomic impact of the logistics sector. Examples are the Baltic Dry Index28, a price index for raw materials shipped, and the Dow Jones Transportation Average29, showing the economic stability of the 20 largest U.S. logistics providers. By applying the power of Big Data analytics, logistics providers have a unique opportunity to extract detailed microeconomic insights from the flow of goods through their distribution networks. They can exploit the huge digital asset that is piled up from the millions of daily shipments by capturing demand and supply figures in various geographical and industry segments.

8 Market intelligence for small and medium-sized enterprises

The aggregation of shipment records (comprising origin, destination, type of goods, quantity, and value) is an extensive source of valuable market intelligence. As long as postal privacy is retained, logistics providers can refine this data in order to substantiate existing external market research. With regression analysis, the fine-grained information in a shipment database can significantly enhance the precision of conventional demand and supply forecasts.

The result has high predictive value, and this compound market intelligence is therefore a compelling service that can be offered to third parties. To serve a broad range of potential customers, the generated forecasts are segmented by industry, region, and product category. The primary target groups for advanced data services such as these are small and medium-sized enterprises that lack capacity to conduct their own customized market research.

DHL Geovista
Online geo marketing tool for SMEs to analyze business potential
Provides realistic sales forecast and local competitor analysis based on a scientific model
A desired location can be evaluated by using high-quality geodata
deutschepost. degeovista

28 Baltic Dry Index, Financial Times Lexicon; cf. lexicon. ftTermtermBaltic-Dry-Index
29 Dow Jones Transportation Average, S&P Dow Jones Indices; cf. djaveragesgotransportation-overview

9 Financial demand and supply chain analytics

Financial analysts depend on data to generate their growth perspectives and stock ratings. Sometimes analysts even perform manual checks on supply chains as the only available source to forecast sales figures or market volumes. So for ratings agencies and advisory firms in the banking and insurance sector, access to the detailed information collected from a global distribution network is particularly valuable. An option for logistics providers is to create a commercial analytics platform allowing a broad range of users to slice and dice raw data according to their field of research, effectively creating new revenue streams from the huge amount of information that controls logistics operations.

In the above use cases, analytics techniques are applied to vast amounts of shipment data. This illustrates how logistics providers can implement new information-driven business models. In addition, the monetization of data that already exists adds the potential of highly profitable revenue to the logistics provider's top line.

3.4.2 Real-time local intelligence

Information-driven business models are frequently built upon existing amounts of data, but this is not a prerequisite. An established product or service can also be extended in order to generate new information assets. For logistics providers, the pickup and delivery of shipments provides a particular opportunity for a complementary new business model. No other industry can provide the equivalent blanket-coverage local presence of a fleet of vehicles that is constantly on the move and geographically distributed.
Logistics providers can equip these vehicles with new devices (with camera, sensor, and mobile connectivity miniaturization powered by the Internet of Things) to collect rich sets of information on the go. This unique capability enables logistics providers to offer existing and new customers a completely new set of value-added data services.

10 Address verification

The verification of a customer's delivery address is a fundamental requirement for online commerce. Whereas address verification is broadly available in industrialized nations, for developing countries and in remote areas the quality of address data is typically poor. This is also partly due to the lack of structured naming schemes for streets and buildings in some locations. Logistics providers can use daily freight, express, and parcel delivery data to automatically verify address data to achieve, for example, optimized route planning with correct geocoding for retail, banking, and public sector entities.

DHL Address Management
Direct match of input data with reference data
Return incomplete or incorrect incoming data with validated data from database
Significant increase of data quality for planning purposes (route planning)

11 Environmental intelligence

The accelerated growth of urban areas30 increases the importance of city planning activities and environmental monitoring. By using a variety of sensors attached to delivery vehicles, logistics providers can produce rich environmental statistics. Data sets may include measurements of ozone and fine dust pollution, temperature and humidity, as well as traffic density, noise, and parking spot utilization along urban roads. As all of this data can be collected en passant (in passing), it is relatively easy for logistics providers to offer a valuable data service to authorities, environment agencies, and real-estate developers while achieving complementary revenues to subsidize, for example, the maintenance of a large delivery fleet.
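One simple way to turn readings collected in passing into environmental statistics is to bin each GPS-tagged measurement into a grid cell and average per cell. The sketch below does exactly that and nothing more; the function name `grid_averages`, the cell size, and the sample readings are hypothetical assumptions for illustration.

```python
import math
from collections import defaultdict


def grid_averages(readings, cell_deg=0.01):
    """Average sensor values per latitude/longitude grid cell.

    readings -- iterable of (latitude, longitude, value) tuples
    cell_deg -- edge length of a grid cell in degrees (assumed value)
    Returns a dict mapping (row, column) cell indices to the mean value.
    """
    sums = defaultdict(lambda: [0.0, 0])   # cell -> [running total, count]
    for lat, lon, value in readings:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        sums[cell][0] += value
        sums[cell][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}


if __name__ == "__main__":
    # Hypothetical fine dust readings from vehicles passing two areas.
    readings = [
        (50.123, 7.456, 40.0),
        (50.127, 7.458, 60.0),
        (50.305, 7.001, 10.0),
    ]
    for cell, mean in grid_averages(readings).items():
        print(cell, mean)
```

The more vehicles pass through a cell, the more readings contribute to its average, which is why a large, constantly moving fleet yields fine-meshed statistics.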
There are numerous other local intelligence use cases exploiting the ubiquity of a large delivery fleet. From road condition reports that steer plowing or road maintenance squads to surveys on the thermal insulation of public households, logistics providers are in pole position as search engines in the physical world. Innovative services that provide all kinds of data in microscopic geographical detail are equally attractive to advertising agencies, construction companies, and public bodies such as police and fire departments. Big Data techniques that extract structured information from real-time footage and sensor data are now building a technical backbone for the deployment of new data-driven business models.

3.5 Success Factors for Implementing Big Data Analytics

Our discussion of Big Data analytics has been focused on the value of information assets and the way in which logistics providers can leverage data for better business performance. This is a good start, as solid use cases are a fundamental requirement for adopting new information-driven business models. But there needs to be more than a positive assessment of business value. The following five success factors must also be in place.

3.5.1 Business and IT alignment

In the past, advancements in information management clearly targeted either a business problem or a technology problem. While trends such as CRM strongly affected the way sales and service people work, other trends such as cloud computing have caused headaches for IT teams attempting to operate dynamic IT resources across the Internet. Consequently, business units and the IT department may have different perspectives on which changes are worth adopting and managing. But for an organization to transform itself into an information-driven company, one that uses Big Data analytics for competitive advantage, both the business units and the IT department must accept and support substantial change.
It is therefore essential to demonstrate and align both a business case and an IT case for using Big Data (including objectives, benefits, and risks). To complete a Big Data implementation, there must be a mutual understanding of the challenges as well as a joint commitment of knowledge and talent.

30 According to the United Nations, by 2050 85.9% of the population in developed countries will live in urban areas. Taken from: Open-air computers, The Economist, Oct. 27, 2012, cf. economistnewsspecial-report21564998-cities-are-turning-vast-data-factories-open-air-computers

3.5.2 Data transparency and governance

Big Data use cases often build upon a smart combination of individual data sources which jointly provide new perspectives and insights. But in many companies the reality is that three major challenges must be addressed to ensure successful implementation. First, to locate data that is already available in the company, there must be full transparency of information assets and ownership. Secondly, to prevent ambiguous data mapping, data attributes must be clearly structured and explicitly defined across multiple databases. And thirdly, strong governance on data quality must be maintained. The validity of mass query results is likely to be compromised unless there are effective cleansing procedures to remove incomplete, obsolete, or duplicate data records. And it is of utmost importance to assure high overall data quality of individual data sources, because with the boosted volume, variety, and velocity of Big Data it is more difficult to implement efficient validation and adjustment procedures.

3.5.3 Data privacy

In the conceptual phase of every Big Data project, it is essential to consider data protection and privacy issues. Personal data is often revealed when exploiting information assets, especially when attempting to gain customer insight.
Use cases are typically elusive in countries with strict data protection laws, yet legislation is not the only constraint. Even when a use case complies with prevailing laws, the large-scale collection and exploitation of data often stirs public debate, and this can subsequently damage corporate reputation and brand value.

3.5.4 Data science skills

A key to successful Big Data implementation is mastery of the many data analysis and manipulation techniques that turn vast raw data into valuable information. The skillful application of computational mathematics makes or breaks reliable and meaningful insights. In most industries, the required mathematical and statistical skillset is scarce. In fact, a talent war is underway, as more and more companies recognize they must source missing data science skills externally. Very specialized knowledge is required to deploy the right techniques for each particular data processing problem, so organizations must invest in new HR approaches in support of Big Data initiatives.

3.5.5 Appropriate technology usage

Many data processing problems currently hyped as Big Data challenges could, in fact, have been technically solved five years ago. But back then, the required technology investment would have shattered every business case. Now at a fraction of the cost, raw computing power has exponentially increased, and advanced data processing concepts are available, enabling a new dimension of performance. The most prominent approaches are in-memory data storage and distributed computing frameworks. However, these new concepts require adoption of entirely new technologies.

For IT departments, implementing Big Data projects therefore requires a thorough evaluation of established and new technology components. It needs to be established whether these components can support a particular use case, and whether existing investments can be scaled up for higher performance. For example, in-memory databases (such as the SAP HANA system) are very fast but have a limited volume of data storage, while distributed computing frameworks (such as the Apache Hadoop framework) are able to scale out to a huge number of nodes but at the cost of delayed data consistency across multiple nodes.

In summary, these are the five success factors that must be in place for organizations to leverage data for better business performance. Big Data is ready to be used.

OUTLOOK

Looking ahead, there are admittedly numerous obstacles to overcome (data quality, privacy, and technical feasibility, to name just a few) before Big Data has pervasive influence in the logistics industry. But in the long run, these obstacles are of secondary importance because, first and foremost, Big Data is driven by entrepreneurial spirit. Several organizations have led the way for us: Google, Amazon, Facebook, and eBay, for example, have already succeeded in turning extensive information into business. Now we are beginning to see first movers in the logistics sector. These are the entrepreneurial logistics providers that refuse to be left behind, the opportunity-oriented organizations prepared to exploit data assets in pursuit of the applications described in this trend report.

But apart from the leading logistics providers that implement specific Big Data opportunities, how will the entire logistics sector transform into a data-driven industry? What evolution can we anticipate in a world where virtually every single shipped item is connected to the Internet? We may not know all of the answers right now. But this trend report has shown there is plenty of headroom for valuable Big Data innovation.
Joining resources, labor, and capital, it is clear that information has become the fourth production factor and essential to competitive differentiation. It is time to tap the potential of Big Data to improve operational efficiency and customer experience, and create useful new business models. It is time for a shift of mindset, a clear strategy and application of the right drilling techniques. Over the next decade, as data assumes its rightful place as a key driver in the logistics sector, every activity within DHL is bound to get smarter, faster, and more efficient.

FOR MORE INFORMATION
About Big Data in Logistics, contact:

Dr. Markus Kückelhaus
DHL Customer Solutions & Innovation
Junkersring 57, 53844 Troisdorf, Germany
Phone: +49 2241 1203 230
Mobile: +49 152 5797 0580
e-mail: markus. kueckelhausdhl

Katrin Zeiler
DHL Customer Solutions & Innovation
Junkersring 57, 53844 Troisdorf, Germany
Phone: +49 2241 1203 235
Mobile: +49 173 239 0335
e-mail: katrin. zeilerdhl

RECOMMENDED READING
LOGISTICS TREND RADAR: dhltrendradar
KEY LOGISTICS TRENDS IN LIFE SCIENCES 2020: dhllifesciences2020.

This document was uploaded on 11/30/2016 for the course MS 6721 at City University of Hong Kong.
Introduction. Lecture BigData Analytics. Julian M. Kunkel, University of Hamburg, German Climate Computing Center (DKRZ)

Outline
1 Introduction
2 BigData Challenges
3 Analytical Workflow
4 Use Cases
5 Programming
6 Summary

About DKRZ
German Climate Computing Center (DKRZ): Partner for Climate Research. Maximum Compute Performance. Sophisticated Data Management. Competent Service.

Scientific Computing
- Research Group of Prof. Ludwig at the University of Hamburg
- Embedded into DKRZ
Research
- Analysis of parallel I/O
- Alternative I/O interfaces
- I/O & energy tracing tools
- Data reduction techniques
- Middleware optimization
- Cost & energy efficiency
Lecture: Concept of the lecture
The lecture focuses on applying technology and some theory.
Theory
- Data models and processing concepts
- Algorithms and data structures
- System architectures
- Statistics and machine learning
Applying technology
- Learning about various state-of-the-art technology
- Hands-on for understanding the key concepts
- Languages: Java, Python, R
The domain of big data is overwhelming, especially in terms of technology. It is a crash course for several topics such as statistics and databases; it is not the goal to learn and understand every aspect in this lecture.

Lecture (2)
Slides
- Many openly accessible sources have been used
- They are cited by number; the reference slide provides the link to the source
- For figures, a reference is indicated by "Source: Author (if available), title [ref]"
- In a slide title, [ref] means that this reference has been used for the slide; some text may be taken literally
Exercise
- Weekly delivery; estimated processing time about 8 hours per week
- Teamwork of 2 or 3 people (groups are mandatory)
- Supported by: Hans Ole Hatzel

Idea of BigData
Methods of obtaining knowledge (process of gaining insight)
- Theory (model), hypothesis, experiment, analysis (repeat)
- Explorative: start theory with observations of phenomena
- Constructivism: starts with axioms and reasons about implications
The Fourth Paradigm: (Big) Data Analytics
- Insight (prediction of the future)
- For industry: insight = business advantage and money
Analytics: follow an explorative approach and study the data
- To infer knowledge, use statistics and machine learning
- Construct a theory (model) and validate it with the data
Example Models
Similarity is a (very) simplistic model and predictor for the world
- Humans use this approach in their cognitive process
- Uses the advantage of BigData
Weather prediction
- You may develop and rely on complex models of physics
- Or use a simple model for a particular day, e.g., expect it to be similar to the weather of that day over the last X years
- Used by humans: rule of thumb for farmers
Preferences of humans
- Identify a set of people who liked items you like
- Predict that you also like the items those people like (items you haven't rated so far)

Relevance of Big Data
- Big Data Analytics is emerging
- Relevance increases compared to supercomputing
[Figure: Google Search Trends, relative searches]

2 BigData Challenges: Volume, Velocity, Variety, Veracity, Value

BigData Challenges & Characteristics
[Figure. Source: MarianVesper [4]]

Volume: The size of the data
What is Big Data?
- Terabytes to 10s of petabytes
What is not Big Data?
- A few gigabytes
Examples
- Wikipedia corpus with history: ca. 10 TByte
- Wikimedia Commons: ca. 23 TByte
- Google search index: ca. 46 Gigawebpages
- YouTube per year: 76 PByte (cf. sumanrs. wordpress20120414youtube-yearly-costs-for-storagenetworking-estimate)

Velocity: Data volume per time
What is Big Data?
- 30 KiB to 30 GiB per second (902 GiB/year to 902 PiB/year)
What is not Big Data?
- A never-changing data set
Examples
- LHC (CERN) with all experiments: about 25 GB/s
- Square Kilometre Array: 700 TB/s (in 2018)
- 50k Google searches per second
- Facebook: 30 billion content pieces shared per month (cf. blog. kissmetricsfacebook-statistics)
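The yearly figures on the velocity slide follow directly from the per-second rates. The short sketch below reproduces the conversion; the function name `per_year` and the use of a 365-day year are assumptions made for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds, non-leap year


def per_year(rate_per_second: float, prefix_steps: int = 2) -> float:
    """Convert a per-second rate into a yearly total.

    The result is expressed in a unit that is `prefix_steps` binary
    prefixes larger than the input unit (KiB -> GiB, or GiB -> PiB).
    """
    return rate_per_second * SECONDS_PER_YEAR / 1024 ** prefix_steps


if __name__ == "__main__":
    # 30 KiB/s expressed in GiB/year; the same number applies to
    # 30 GiB/s expressed in PiB/year.
    print(round(per_year(30), 2))
```

Because KiB to GiB and GiB to PiB are both two binary prefix steps, the same factor applies to both ends of the range quoted on the slide.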
Data Sources
Enterprise data
- Serves business objectives, well defined
- Customer information
- Transactions, e.g., purchases
Experimental/observational data (EOD)
- Created by machines from sensors/devices
- Trading systems, satellites
- Microscopes, video streams, smart meters
Social media
- Created by humans
- Messages, posts, blogs, wikis

Variety: Types of data
Structured data
- Like tables with fixed attributes
- Traditionally handled by relational databases
Unstructured data
- Usually generated by humans
- E.g., natural language, voice, Wikipedia, Twitter posts
- Must be processed into (semi-structured) data to gain value
Semi-structured data
- Has some structure in tags, but it changes with documents
- E.g., HTML, XML, JSON files, server logs
What is Big Data?
- Use data from multiple sources and in multiple forms
- Involve unstructured and semi-structured data

Veracity: Trustworthiness of data
What is Big Data?
- Data involves some uncertainty and ambiguities
- Mistakes can be introduced by humans and machines: people sharing accounts; liking something today and disliking it tomorrow; wrong system timestamps
Data quality is vital
- Analytics and conclusions rely on good data quality
- Garbage data + perfect model => garbage results
- Perfect data + garbage model => garbage results
- GIGO paradigm: Garbage In, Garbage Out

Value of data
What is Big Data?
- Raw data of Big Data is of low value, for example single observations
- Analytics and theory about the data increase the value
- Analytics transform big data into smart data
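The variety slide lists server logs as semi-structured data that has to be processed before it yields value. As a rough illustration, the sketch below turns one web server log line into a structured record. The log layout (a common access-log style) and all names are assumptions, not something the lecture prescribes.

```python
import re

# Assumed access-log layout: host, two ignored fields, [timestamp],
# "METHOD path protocol", status code, response size.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line: str):
    """Return a dict with typed fields, or None if the line does not fit."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # veracity: keep malformed input out of the analysis
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


if __name__ == "__main__":
    sample = '192.0.2.1 - - [23/Aug/2017:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024'
    print(parse_line(sample))
    print(parse_line("corrupted line"))
```

Returning None for lines that do not match keeps malformed input visible to the caller instead of silently turning it into garbage records.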
Types of Data Analytics and Value of Data
1 Descriptive analytics (describe): What happened?
2 Diagnostic analytics: Why did this happen, what went wrong?
3 Predictive analytics (predict): What will happen?
4 Prescriptive analytics (recommend): What should we do and why?
The level of insight and value of data increases from step 1 to 4.

The Value of Data (alternative view)
[Figure. Source: Dursun Delen, Haluk Demirkan [9]]

The Value of Data (alternative view 2)
[Figure. Source: Forrester report, Understanding The Business Intelligence Growth Opportunity]

3 Analytical Workflow: Value Chain, Roles, Privacy

Big Data Analytics Value Chain
There are many visualizations of the processing and value chain [8].
[Figure. Source: Andrew Stein [8]]

Big Data Analytics Value Chain (2)
[Figure. Source: Miller and Mork [7]]

Roles in the Big Data Business
Data scientist
- Data science is a systematic method dedicated to knowledge discovery via data analysis [1]
- In business, optimize organizational processes for efficiency
- In science, analyze experimental/observational data to derive results
Data engineer
- Data engineering is the domain that develops and provides systems for managing and analyzing big data
- Build modular and scalable data platforms for data scientists
- Deploy big data solutions

Typical Skills
Data scientist
- Statistics (and mathematics)
- Computer science: programming (e.g., Java, Python, R, SAS), machine learning
- Some domain knowledge for the problem to solve
Data engineer
- Computer science: databases, software engineering, massively parallel processing, real-time processing
- Languages: C, Java, Python
- Understand performance factors and limitations of systems

Data Science vs. Business Intelligence (BI)
Characteristics of BI
- Provides pre-created dashboards for management
- Repeated visualization of well-known analysis steps
- Deals with structured data
- Typically data is generated within the organization
- Central data storage (vs. multiple data silos)
- Handled well by specialized database techniques
Typical types of insight
- Customer service data: what business causes the largest customer wait times?
- Sales and marketing data: which marketing is most effective?
- Operational data: efficiency of the help desk
- Employee performance data: who is most/least productive?

Privacy
Be aware of privacy issues if you deal with personal/private information. German privacy laws are stricter than those of other countries.
Goal of data protection
- Right to informational self-determination
- Protection of the individual against infringement of personality rights through the handling of their personal data
- Special protection for data on health, ethnic origin, and religious, trade-union, or sexual orientation
Personal data (§ 3 BDSG): individual details about the personal or material circumstances of an identified or identifiable natural person.
Key principles of the law [10]
- Prohibition with reservation of permission: the collection, processing, use, and disclosure of personal data are prohibited; use is allowed only on a legal basis or with the consent of the person
- Companies with 10 persons need a data protection officer
- Procedures for automated processing must be reviewed by the data protection officer and are subject to notification
- The seat of the responsible body is decisive: with a branch in Germany, the BDSG applies
- Principles: data avoidance and data minimization
- Protection against access, modification, and disclosure
- Data subjects have the right to information, deletion, or blocking
- Anonymization/pseudonymization: if assignment to individual persons is (almost) impossible, the data may be processed

4 Use Cases: Overview
[Figure. Source: [21]]

Use Cases for BigData Analytics
Increase efficiency of processes and systems
- Advertisement: optimize for target audience
- Product: acceptance (like/dislike) of buyer, dynamic pricing
- Decrease financial risks: fraud detection, account takeover
- Insurance policies: modeling of catastrophes
- Recommendation engine: stimulate purchase/consumption
- Systems: fault prediction and anomaly detection
- Supply chain management
Science
- Epidemiology research: Google searches indicate flu spread
- Personalized healthcare: recommend good treatment
- Physics: finding the Higgs boson, analyzing telescope data
- Enabler for social sciences: analyze people's mood

Big Data in Industry
[Figure. Source: [20]]
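The pseudonymization principle listed among the key principles of the law above can be illustrated with a keyed hash. This is only a sketch of one common technique, not a statement of what the BDSG requires: the function name, the key handling, and the sample identifiers are assumptions, and whether the result counts as sufficiently pseudonymized is a legal question the code cannot answer.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 digest (HMAC).

    The same identifier and key always give the same pseudonym, so
    records can still be joined, while the original value cannot be
    read from the digest without the key.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    key = b"example-key-kept-separately"  # hypothetical; store apart from the data
    print(pseudonymize("customer-4711", key))
    print(pseudonymize("customer-4712", key))
```

Using a keyed hash rather than a plain hash matters here: without the key, an attacker cannot simply hash a list of known identifiers and compare the results.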
Example Use Case: Deutschland Card [2]
Goals
- Customer bonus card which tracks purchases
- Increase scalability and flexibility
- Previous solution based on OLAP
Big Data characteristics
- Volume: O(10) TB
- Variety: mostly structured data, schemas are extended steadily
- Velocity: data growth rate O(100) GB/month
Results
- Much better scalability of the solution
- From dashboards to ad-hoc analysis within minutes

Example Use Case: DM [2]
Goals
- Predict required employees per day and store
- Prevent staff changes on short notice
Big Data characteristics
- Input data: opening hours, incoming goods, employee preferences, holidays, weather
- Model: NeuroBayes (Bayesian neural networks)
- Predictions: sales and employee planning, predictions per week
Results
- Daily updated sales per store
- Reliable predictions for staff planning
- Customer and employee satisfaction

Example Use Case: OTTO [2]
Goals
- Optimize inventory and prevent out-of-stock situations
Big Data characteristics
- Input data: product characteristics, advertisement
- Volume/velocity: 135 GB/week, 300 million records
- Model: NeuroBayes (Bayesian neural networks), 1 billion predictions per year
Results
- Better prognostics of product sales (up to 40%)
- Real-time data analytics

Example Use Case: Smarter Cities (by KTH) [2]
Goals
- Improve traffic management in Stockholm
- Prediction of alternative routes
Big Data characteristics
- Input data: traffic videos/sensors, weather, GPS
- Volume/velocity: 250k GPS data/s plus other data sources
Results
- 20% less traffic
- 50% reduction in travel time
- 20% less emissions

Example Facebook Studies
Insight from [11] by exploring posts
- Young narcissists are more likely to tweet.
- Middle-aged narcissists update their status
- US students post more problematic information than German students
- US government checks tweets/Facebook messages for several reasons
- Human communication graph has an average diameter of 4.74
Manipulation of news feeds [13]
- News feeds have been changed to analyze people's behavior in subsequent posts
- Paper: Experimental evidence of massive-scale emotional contagion through social networks

From Big Data to the Data Lake [20]
With cheap storage costs, people promote the concept of the data lake
- Combines data from many sources and of any type
- Allows for conducting future analysis without missing any opportunity
Attributes of the data lake
- Collect everything: all data, both raw sources over extended periods of time as well as any processed data
- Decide during analysis which data is important, e.g., no schema until read
- Dive in anywhere: enable users across multiple business units to refine, explore and enrich data on their terms
- Flexible access: enable multiple data access patterns across a shared infrastructure: batch, interactive, online, search, and others

5 Programming: Java, Python, R

Programming BigData Analytics
High-level concepts
- SQL and derivatives
- Domain-specific languages (Cypher, PigLatin)
Programming languages
- Java interfaces are widely available but low-level
- Python and R have connectors to popular BigData solutions
In the exercises, we'll learn and use the basics of those languages/interfaces.
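The data lake attribute "no schema until read" can be made concrete with a few lines of Python. The sketch stores raw lines untouched and applies a schema only when a particular analysis reads them. The event data, the field names, and the function `read_with_schema` are made up for illustration.

```python
import json

# Raw events are kept exactly as they arrived, including ones that
# no current analysis can use.
RAW_EVENTS = [
    '{"user": "a", "amount": "12.50", "ts": "2017-08-23"}',
    '{"user": "b", "clicks": 3}',
    'not json at all',
]


def read_with_schema(raw_lines, schema):
    """Yield records that fit `schema`, a mapping of field -> converter.

    The schema is chosen by the reader at analysis time; lines that do
    not fit are skipped for this analysis but remain in the lake.
    """
    for line in raw_lines:
        try:
            record = json.loads(line)
            yield {field: convert(record[field]) for field, convert in schema.items()}
        except (ValueError, KeyError, TypeError):
            continue


if __name__ == "__main__":
    purchases = list(read_with_schema(RAW_EVENTS, {"user": str, "amount": float}))
    clicks = list(read_with_schema(RAW_EVENTS, {"user": str, "clicks": int}))
    print(purchases)
    print(clicks)
```

Two different schemas read the same raw data here, which is the point of deferring the schema: a later analysis can ask a question nobody anticipated when the data was collected.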
Introduction to Java
- Developed by Sun Microsystems in 1995
- Object-oriented programming language
- OpenJDK implementation is open source
- Source code -> byte code -> just-in-time compiler
- Byte code is portable & platform independent
- Virtual machine abstracts from systems
- Strong and static type system
- Popular language for enterprise & Big Data applications
- Most popular programming language (position 1 on the TIOBE index)
- Development tools: Eclipse
Specialties
- Good runtime and compile-time error reporting
- Generic data types (vs. templates of C++)
- Introspection via reflection

Introduction to Python
- Open source
- Position 5 on TIOBE index
- Interpreted language
- Dynamic type system (type errors surface at runtime)
- Development tools: any editor, interactive shell
- Note: use and learn python3 explicitly
- Recommended plotting library: matplotlib
Specialties
- Strong text processing
- Simple to use
- Support for object-oriented programming
- Indentation is relevant for code blocks

Example Python Program

    #!/usr/bin/env python3
    import re  # use the module re

    # function reading a file
    def readfile(filename):
        try:
            with open(filename, "r") as f:  # the with block closes the file
                return f.readlines()
        except OSError:
            return []  # return an empty array/list

    # the main function
    if __name__ == "__main__":
        data = readfile("intro.py")
        # iterate over the array
        for x in data:
            # extract imports from a python file using a regex
            m = re.match(r"import[ \t]+(?P<what>[\w.]+)", x)
            if m:
                print(m.group("what"))
                # dictionary (key value pair)
                dic = m.groupdict()
                dic.update({"file": "intro.py"})  # add a new key to the dict
                # use format string with dictionary
                print("Found import %(what)s in file %(file)s" % dic)
                # Prints: Found import re in file intro.py

Example Python Classes

    from abc import abstractmethod

    class Animal():
        # constructor; functions taking self are instance methods, else class methods
        def __init__(self, weight):
            self.weight = weight  # private variables start with __

        @abstractmethod  # decorator
        def name(self):
            return self.__class__.__name__  # reflection-like

        def __str__(self):
            return "I'm a %s with weight %f" % (self.name(), self.weight)

    class Rabbit(Animal):
        def __init__(self):
            # super() is available with python 3
            super().__init__(2.5)

        def name(self):
            return "Small Rabbit"  # override name

    if __name__ == "__main__":
        r = Rabbit()
        print(r)  # print: I'm a Small Rabbit with weight 2.500000

Introduction to R
- Based on the S language for statisticians
- Open source
- Position 19 on TIOBE index
- Interpreter with C modules (packages)
- Easy installation of packages via CRAN (Comprehensive R Archive Network)
- Popular language for data analytics
- Development tools: RStudio (or any editor), interactive shell
- Recommended plotting library: ggplot2

Specialties
- Vector/matrix operations. Note: Loops are slow, so avoid them
- Table data structure (data frames)

Course for Learning R Programming

    # Run with "Rscript intro.r" or run "R" and copy & paste into the interactive shell
    # Installing a new package is as easy as:
    install.packages("swirl")
    # Note: sometimes packages are not available on all mirrors
    library(swirl)  # load the package

    help(swirl)  # read help about the function swirl
    swirl()      # start an interactive course to learn R

    # a simple for loop
    for (x in 1:10) ... else ...

Example R Program

    # create an array
    x = c(1, 2, 10:12)

    # apply an operator on the full vector and output it
    print( x * 2 )          # prints: 2 4 20 22 24

    # slice arrays
    print( x[3:5] )         # prints: 10 11 12
    print( x[c(1,4,8)] )    # prints: 1 11 NA

    r = runif(100, min=0, max=100)        # create array with random numbers
    m = matrix(r, ncol=4, byrow = TRUE)   # create a matrix

    # slice matrix rows "m[row(s), column(s)]"
    print( m[10:12, ] )
    # Output:
    #      [,1] [,2] [,3] [,4]
    # [1,] ...
    # [2,] ...
    # [3,] ...

    # slice rows & columns
    print( m[10, c(1,4)] )  # Output: [1] ...

    # subset the table based on a mask
    set = m[ (m[,1] < 20 & m[,2] > 2), ]

Summary
- Big data analytics: explore data and model causalities to gain knowledge & value
- Challenges: 5 Vs (volume, velocity, variety, veracity, value)
- Data sources: enterprise, humans, experimental/observational data (EOD)
- Types of data: structured, unstructured and semi-structured
- Levels of analytics: descriptive, predictive and prescriptive
- Roles in big data business: data scientist and engineer
- Data science vs. business intelligence

Bibliography
[1] Book: Lillian Pierson. Data Science for Dummies. John Wiley & Sons
[2] Report: Jürgen Urbanski et al. Big Data im Praxiseinsatz - Szenarien, Beispiele, Effekte. BITKOM
[4] Forrester Big Data Webinar. Holger Kisker, Martha Bennet. Big Data: Gold Rush Or Illusion?
    Gilbert Miller, Peter Mork. From Data to Decisions: A Value Chain for Big Data.
[8] Andrew Stein. The Analytics Value Chain.
[9] Dursun Delen, Haluk Demirkan. Data, information and analytics as services. Decision Support Systems. j.mp/11bl9b9
[10] Wikipedia
[11] Kashmir Hill. 46 Things We've Learned From Facebook Studies. Forbes.
[12] Hortonworks
