Tag: apache-hadoop

33 Do I need to learn Hadoop to be a Data Scientist? 2014-06-10T06:20:20.817

27 What are the use cases for Apache Spark vs Hadoop 2014-06-17T20:48:35.267

13 What is the difference between Hadoop and noSQL 2014-05-14T10:44:58.933

11 Can map-reduce algorithms written for MongoDB be ported to Hadoop later? 2014-05-18T12:03:21.650

11 Does Amazon RedShift replace Hadoop for ~1XTB data? 2014-06-11T04:24:04.183

10 Tradeoffs between Storm and Hadoop (MapReduce) 2014-06-01T10:25:51.163

9 What are R's memory constraints? 2014-05-14T17:48:21.240

7 Cascaded Error in Apache Storm 2014-06-01T12:51:25.040

7 Linear Regression in R Mapreduce(RHadoop) 2014-07-03T10:49:50.993

7 Data science and MapReduce programming model of Hadoop 2014-07-28T16:17:49.823

6 Good books for Hadoop, Spark, and Spark Streaming 2014-12-05T05:50:29.903

6 Lambda Architecture - How to implement the Merge Layer / Query Layer 2015-01-02T20:03:59.950

5 Processing data stored in Redshift 2014-11-12T17:27:57.850

5 Improve k-means accuracy 2016-02-02T01:42:38.053

4 HBase connector - Thrift or REST 2014-06-10T06:19:46.510

4 Hadoop for grid computing 2014-09-04T18:13:57.343

4 How to set up multi cluster spark without hadoop on Google Compute engine 2014-12-07T16:31:57.913

4 Can we access HDFS file system and YARN scheduler in Apache Spark? 2015-01-30T18:55:46.173

4 Storing Sensor Data for Analysis of the Office 2015-07-03T08:52:36.247

4 Is there a benefit to using hadoop with only one node? 2015-10-11T01:05:04.113

4 Saving Large Spark ML Pipeline to HDFS 2018-01-08T16:19:33.187

3 Difference Between Hadoop Mapreduce(Java) and RHadoop mapreduce 2014-06-27T12:03:53.357

3 Cloudera QuickStart VM Error 2014-07-09T17:51:40.583

3 Hive: How to calculate the Kendall coefficient of correlation of a pair of numeric columns in a group? 2014-12-01T14:52:31.827

3 java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream 2014-12-09T00:58:03.057

3 Data produced as an output to Dumbo API of Python not getting distributed to all the nodes of cluster 2015-06-27T06:34:46.957

3 Is our data "Big Data" (Startup) 2015-07-28T14:55:13.340

3 Can all statistical algorithms be parallelized using a Map Reduce framework 2015-08-26T20:44:06.540

3 unable to parse XML in pig 2016-05-10T16:35:06.310

3 Skills that school doesn't teach you 2016-08-17T19:08:17.143

3 Deploying models on bigdata platforms like Hadoop and Spark 2017-03-09T12:32:53.170

2 Cannot make user directory on a new CDH5 installation (Hadoop) 2014-07-03T14:18:14.387

2 Hadoop Resource Manager Won't Start 2014-07-19T20:51:58.527

2 Pig script code error? 2014-07-24T06:26:07.290

2 Pig latin code error 2014-07-24T06:34:50.083

2 Pig Rank function not generating rank in output 2014-08-08T17:32:48.377

2 Using Shark with Apache Spark 2014-08-26T21:37:12.107

2 Differences in scoring from PMML model on different platforms 2014-10-17T13:58:39.353

2 Error when using MAX in Apache Pig (Hadoop) 2015-02-09T00:18:46.427

2 How does append work in HDFS? Where is the newly created instance of the file placed? 2015-03-05T18:28:43.367

2 Can Hadoop be beneficial when data is in database tables and not in a file system 2015-08-26T20:49:40.957

2 How to make k-means distributed? 2016-02-06T02:38:20.750

2 How to read contents of a CSV file inside zip file using spark (python) 2016-05-05T23:43:27.647

2 Mahout Spark shell not working 2016-11-02T09:44:46.437

2 How many people can use a single Hadoop cluster at one time? 2016-11-13T20:25:56.503

2 K-means clustering on big data stored on multiple nodes on HDFS 2017-02-10T06:02:39.710

2 Ingestion of periodic REST API Calls into Hadoop 2017-03-07T14:06:42.100

2 Are jobs the only way out for data scientists? 2017-05-09T08:57:50.920

2 Why has Hadoop failed to become popular? 2017-05-29T18:21:27.990

2 How to setup a home-laptop cluster to 'practice' elasticsearch, hadoop, mesos and spark 2017-07-06T18:02:16.953

1 Masters thesis topics in big data 2014-10-19T11:02:44.397

1 Hadoop/Pig Aggregate Data 2014-12-23T19:46:57.267

1 Can we use HDFS and big data Analytics for processing huge log files being processed through some application on some central server? 2015-06-18T07:09:08.140

1 How to use REST API to execute Map-Reduce Task? 2015-07-07T09:23:45.250

1 how to disable query from beeline results 2015-11-03T13:09:48.553

1 Pig is not able to read the complete data 2015-12-17T07:58:13.057

1 A Simple Explanation of ZooKeeper in Hadoop 2016-01-11T12:19:27.103

1 Do something with the output of reducers 2016-02-07T05:19:47.063

1 freebcp getting stalled for huge data 2016-02-11T13:36:17.973

1 Is there a text on Apache Spark that attempts to be as comprehensive as White's 'Hadoop: The Definitive Guide'? 2016-06-04T11:37:38.327

1 Suggestions on what patterns/analysis to derive from Airlines Big Data 2016-06-22T17:26:53.710

1 How to multiply a "fat and short" matrix with a "tall and thin" matrix using MapReduce? 2016-07-01T09:14:41.140

1 How to Scale Out Artificial Neural Networks? 2016-10-31T07:33:34.623

1 spark item similarity recommendation 2016-11-01T09:20:22.320

1 Can R + Hadoop overcome R's memory constraints in any case? 2017-03-24T20:19:21.747

1 Hadoop Cluster Capacity Planning 2017-08-15T10:20:16.817

1 Hadoop and input information divided into splits 2017-12-26T18:42:37.020

1 Yarn service parameter for pseudodistributed 2017-12-31T11:39:33.533

1 Hadoop - checksum while reading file from client 2018-01-05T12:54:06.900

1 Hdfs Data Balance on Cluster 2018-01-06T12:17:45.160

1 Tasks of a Yarn Process in Hadoop 2018-01-06T12:58:10.217

1 Best practice for developing using Spark 2018-02-09T12:59:34.347

0 Can hadoop with Spark be configured with 1GB RAM 2014-12-07T04:40:53.677

0 Questions on "Active Archive" 2015-01-24T15:15:27.777

0 Extract company names/job titles from free text 2015-02-09T17:28:54.390

0 When it is time to use Hadoop? 2015-02-17T19:10:05.547

0 Accessing directory of small files as one file 2015-07-28T12:40:39.007

0 Yarn timeline recovery not enabled error upgrading via ambari 2015-10-25T23:37:21.817

0 Predictive Analytics on distributed systems vs standalone system 2016-07-01T18:51:04.507

0 Spark algorithm to make a link analysis 2016-08-26T13:13:06.470

0 Machine Learning model to find items that are frequently bought together using Hadoop Spark 2016-08-29T14:15:22.463

0 what ETL technique should i use for text documents using Hadoop? 2017-04-16T17:40:16.257

0 Is it possible to perform a bitwise group function in Hive? 2017-08-17T11:01:29.577

0 Why my master node got heap memory full for inbuilt SVD API in Apache Spark during calculation of inverse of a square matrix? 2017-11-15T11:00:26.433

0 Custom Writable Serialization in Hadoop 2018-01-03T10:28:22.260

0 Which one of these tasks will benefit the most from SPARK? 2018-01-07T12:31:32.830

0 Hadoop - Capacity Scheduler and Fifo scheduler 2018-01-15T20:57:14.983

0 spark.dynamicAllocation + setting the spark parameters according to ambari cluster 2018-02-09T00:16:01.867

0 How to extract and convert 18GB compressed wikipedia dataset in hadoop 2018-03-02T05:48:52.197

0 org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4 2018-03-07T10:19:24.930

-1 getting error:-Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/io/Writable 2015-05-28T07:38:38.790

-1 Connecting Twitter API to a Big Data Environment? 2016-10-19T06:45:22.967

-1 Install Spark and Hadoop in the same machine 2017-08-23T12:40:16.087

-5 Does a big data virtual machine help in analyzing a large file? 2016-11-08T04:12:52.277