https://github.com/yanagishima/yanagishima
https://github.com/szyn/docker-yanagishima
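* When running under Docker, leave Presto out of the compose file and connect to an existing coordinator through PRESTO_COORDINATOR_URL only.
If a Presto container is needed after all, follow these two steps to add the Presto and Hive settings.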
1> docker-compose.yml
presto:
  image: skame/presto:0.189
  ports:
    - "8091:8080"
  volumes:
    - .docker/presto/etc:/opt/presto/etc
2> .docker/presto/etc/catalog/hive.properties
connector.name=hive-hadoop2
hive.metastore.uri=thrift://localhost:10000
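The catalog file from step 2 can also be generated by a script, so the metastore address is not hand-edited for each environment. The following is a minimal sketch, not part of the original setup: `render_hive_catalog` and `write_hive_catalog` are hypothetical helper names. Two things are worth checking in your own environment. The Hive metastore Thrift service listens on 9083 by default, while 10000 is the usual HiveServer2 port. Also, `localhost` inside a container refers to the container itself. The sketch defaults to 9083 but accepts whatever host and port you pass.

```python
from pathlib import Path


def render_hive_catalog(metastore_host: str, metastore_port: int = 9083) -> str:
    """Return the text of a Presto Hive catalog file (hypothetical helper)."""
    return (
        "connector.name=hive-hadoop2\n"
        f"hive.metastore.uri=thrift://{metastore_host}:{metastore_port}\n"
    )


def write_hive_catalog(etc_dir, metastore_host: str, metastore_port: int = 9083) -> Path:
    """Write <etc_dir>/catalog/hive.properties and return its path."""
    catalog_dir = Path(etc_dir) / "catalog"
    catalog_dir.mkdir(parents=True, exist_ok=True)
    path = catalog_dir / "hive.properties"
    path.write_text(render_hive_catalog(metastore_host, metastore_port), encoding="utf-8")
    return path
```

For example, `write_hive_catalog(".docker/presto/etc", "localhost", 10000)` reproduces the two lines from the note above.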
https://github.com/shawnzhu/docker-prestodb
CREATE TABLE local.default.review (
    id bigint,
    created_at date,
    updated_at date,
    use_yn varchar,
    access_token varchar,
    apply_at date,
    campaign_apply_id bigint,
    campaign_id bigint,
    content varchar,
    feed_count_in30 int,
    follow_count int,
    member_id bigint,
    ouuid char,
    recommend_text varchar,
    recommend_yn varchar,
    result_at date,
    score double,
    shared_url varchar,
    link_url varchar,
    video_url varchar,
    ship_at date,
    sns_code varchar,
    sns_id varchar,
    status_code varchar,
    tag varchar,
    view_count int,
    product_id bigint,
    device_type int,
    review_type varchar,
    has_video int,
    sales_id bigint,
    ze_order_detail_id bigint,
    category_id bigint,
    product_name varchar,
    brand_name varchar,
    review_product_slave_uid bigint,
    review_brand_slave_uid bigint,
    country varchar
)
WITH (format = 'ORC')
INSERT INTO hive.default.report_review_20191226 (campaign_name, review_content, image_url)
SELECT c.name, r.content, i.url
FROM campaign c
INNER JOIN review r ON c.id = r.campaign_id
INNER JOIN image i ON r.ouuid = i.ouuid
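The report table name above ends in a date, so the same INSERT is probably re-run for other dates. The sketch below builds the statement for a given day. It assumes the `report_review_YYYYMMDD` pattern read off the single table name in this note. `report_table_name` and `build_report_insert` are hypothetical helpers, and the target table must already exist.

```python
from datetime import date


def report_table_name(report_date: date, catalog: str = "hive", schema: str = "default") -> str:
    """Build the dated report table name (naming pattern is an assumption)."""
    return f"{catalog}.{schema}.report_review_{report_date:%Y%m%d}"


def build_report_insert(report_date: date) -> str:
    """Return the INSERT ... SELECT statement for one report date."""
    table = report_table_name(report_date)
    return (
        f"INSERT INTO {table} (campaign_name, review_content, image_url)\n"
        "SELECT c.name, r.content, i.url\n"
        "FROM campaign c\n"
        "INNER JOIN review r ON c.id = r.campaign_id\n"
        "INNER JOIN image i ON r.ouuid = i.ouuid"
    )
```

The returned string can be passed to `cur.execute(...)` of any Presto client cursor.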
$ pip install presto-python-client
import prestodb

# Connect to the Presto coordinator; the connection is used as a context manager.
with prestodb.dbapi.connect(
    host='10.0.0.34',
    port=8080,
    user='root',
    catalog='local',
    schema='default',
) as conn:
    cur = conn.cursor()
    #cur.execute('INSERT INTO sometable VALUES (1, 2, 3)')
    #cur.execute('INSERT INTO sometable VALUES (4, 5, 6)')
    query = "SELECT * FROM local.default.report_review_20191230_v1 where campaign_name like '%만두%'"
    cur.execute(query)
    rows = cur.fetchall()
    print(rows)
# Variant 1: let pandas run the query through a PyHive connection.
import pandas as pd
from pyhive import presto

connection = presto.connect(host='10.0.0.34', port=8080)
df = pd.read_sql_query("SELECT * FROM local.default.report_review_20191230_v1 where campaign_name like '%만두%'", connection)
print(df.head())

# Variant 2: run the query on a cursor and build the DataFrame from the raw rows.
# Columns are numbered 0, 1, 2, ... here because only the row tuples are passed in.
import pandas as pd
from pyhive import presto

connection = presto.connect(host='10.0.0.34', port=8080)
cur = connection.cursor()
cur.execute("SELECT * FROM local.default.report_review_20191230_v1 where campaign_name like '%만두%'")
df = pd.DataFrame(cur.fetchall())
print(df.head())
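When rows come straight from `cur.fetchall()`, the column names are lost. The DB-API `description` attribute on the cursor carries them, so the rows can be keyed by name before being handed to pandas. This is a minimal sketch: `rows_as_dicts` is a hypothetical helper, and `FakeCursor` with its sample row is a stand-in so the snippet runs without a Presto server.

```python
def rows_as_dicts(cursor):
    """Fetch all rows from a DB-API cursor and key each value by column name."""
    rows = cursor.fetchall()
    # Read the metadata after fetching, in case the driver fills it in lazily.
    description = cursor.description
    if not description:
        raise ValueError("cursor has no column metadata")
    columns = [col[0] for col in description]  # first field of each entry is the name
    return [dict(zip(columns, row)) for row in rows]


class FakeCursor:
    """Stand-in for a real cursor; the column names and row are made up."""
    description = [("campaign_name", "varchar"), ("review_content", "varchar")]

    def fetchall(self):
        return [("dumpling promo", "tasty")]


if __name__ == "__main__":
    print(rows_as_dicts(FakeCursor()))
```

With a real cursor, `pd.DataFrame(rows_as_dicts(cur))` should give a DataFrame with named columns.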