TDX (Transport Data eXchange) API Usage Tutorial


Basic API usage

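First, follow the official API usage instructions to obtain an ID and a key.

Store the ID and key either in a separate file or directly in your code so they are easy to reuse later. The result looks like this: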

# Import the helper; after that, calling an API only requires its path
from auth_TDX import get_api_return_df

api_use = '/v3/Rail/TRA/Station'  # this endpoint returns the station names
train_basic_info = get_api_return_df(api_use)
train_basic_info.head()
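The auth_TDX helper itself is not shown in this post. As a rough idea of what such a helper has to do, here is a minimal sketch that only builds the two HTTP requests involved, assuming TDX's OAuth2 client-credentials flow. The function names (build_token_request, build_api_request, fetch_json) are hypothetical, and the two URLs are assumptions based on TDX's public documentation, so check them against the official guide before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoints; verify against the official TDX documentation.
TOKEN_URL = "https://tdx.transportdata.tw/auth/realms/TDXConnect/protocol/openid-connect/token"
API_BASE = "https://tdx.transportdata.tw/api/basic"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) the client-credentials token request."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_api_request(path: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) an authorized GET request for an API path."""
    url = f"{API_BASE}{path}?$format=JSON"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {access_token}"})


def fetch_json(request: urllib.request.Request):
    """Send a prepared request and decode the JSON body (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

In use, fetch_json(build_token_request(client_id, client_secret)) would be expected to return a dict containing an access token, which is then passed to build_api_request together with a path such as '/v3/Rail/TRA/Station'.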

Mapping station names to station IDs

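The top level of the response mostly holds basic information such as timestamps; the details all live under "Stations", so that field needs further processing. Each "Stations" entry is a JSON object, and what we actually need from it is the Chinese (Zh_tw) station name (StationName) and the corresponding station ID (StationID).

The goal is a lookup table that lets us find a station ID from its Chinese name. The code is as follows: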

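def gen_train_hash_from_df(data) -> dict:
    """Build a {Chinese station name: station ID} lookup table.

    Args:
        data: station DataFrame returned by the API

    Raises:
        ValueError: if the data does not have the expected shape

    Returns:
        dict: the finished hash map
    """
    # Use a separate dict so the input argument is not overwritten
    train_hash = {}
    try:
        for _, row in data.iterrows():
            station_name = row['Stations']['StationName']['Zh_tw']
            station_id = row['Stations']['StationID']
            train_hash[station_name] = station_id
    except Exception as exc:
        raise ValueError('Get unexpected data type.') from exc
    return train_hash


# The later functions use this table as their default lookup
train_hash = gen_train_hash_from_df(train_basic_info)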
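To make the nested structure easier to picture without calling the API, here is a small stand-alone sketch that builds the same kind of lookup table from hand-made records. The helper name build_station_lookup is hypothetical, and the station IDs are placeholders for illustration that may not match real TDX data.

```python
# Hand-made records mimicking the nested "Stations" field described above.
# The IDs are illustrative placeholders, not guaranteed to match real TDX data.
sample_rows = [
    {"Stations": {"StationID": "1000", "StationName": {"Zh_tw": "臺北", "En": "Taipei"}}},
    {"Stations": {"StationID": "3300", "StationName": {"Zh_tw": "臺中", "En": "Taichung"}}},
]


def build_station_lookup(rows):
    """Return {Chinese station name: station ID} from nested station records."""
    lookup = {}
    for row in rows:
        station = row["Stations"]
        lookup[station["StationName"]["Zh_tw"]] = station["StationID"]
    return lookup


station_lookup = build_station_lookup(sample_rows)
print(station_lookup)
```

A plain dict is enough here because station names are unique keys and every later step only needs a name-to-ID lookup.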

Getting train information for a given origin, destination, and date

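With the lookup table in hand, the next step is a function that takes an origin station, a destination station, and a date, and returns the matching train information.

1. When typing, both "台" and "臺" are in common use, but the lookup table only contains "臺". A simple replace fixes this (get_train_num_from_train_name).

2. The timetable API used here requires dates in yyyy-MM-dd format, so we write a helper that returns today's date in that format (get_today_format).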


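3. Wrap the pieces above together. The basic flow: the user supplies the origin and destination stations plus an optional date; if no date is given, it defaults to today. The function returns the DataFrame produced by the API call (get_train_list_from_ini_station_fin_station_date).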

import datetime
from typing import Optional

import pandas as pd


def get_train_num_from_train_name(train_name: str, hash: dict = train_hash) -> str:
    # The lookup table only uses '臺', so normalize '台' before looking up
    train_name_new = train_name.replace('台', '臺')
    return hash[train_name_new]


def get_today_format() -> str:
    # Today's date in the yyyy-MM-dd format the API requires
    current_date = datetime.date.today()
    return current_date.strftime("%Y-%m-%d")


def get_train_list_from_ini_station_fin_station_date(
    ini_station: str, fin_station: str, train_date: Optional[str] = None
) -> pd.DataFrame:
    """Fetch the daily timetable between two stations.

    Args:
        ini_station (str): origin station name in Chinese, e.g. '后里'
        fin_station (str): destination station name in Chinese, e.g. '台中'
        train_date (str, optional): date in yyyy-MM-dd format. Defaults to today.

    Returns:
        pd.DataFrame: data from the API
    """
    if train_date is None:
        # Resolve "today" at call time; a default of get_today_format()
        # in the signature would be evaluated only once, at definition time
        train_date = get_today_format()
    OriginStationID = get_train_num_from_train_name(ini_station)
    DestinationStationID = get_train_num_from_train_name(fin_station)
    test = f'/v3/Rail/TRA/DailyTrainTimetable/OD/Inclusive/{OriginStationID}/to/{DestinationStationID}/{train_date}'
    train_time_table = get_api_return_df(test)
    return train_time_table


get_train_data = get_train_list_from_ini_station_fin_station_date('后里', '台中')  # test run
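Because the date ends up directly in the request path, it may be worth checking user-supplied dates before calling the API. The following is a minimal sketch of that idea using only the standard library; normalize_train_date is a hypothetical helper name, not part of the original code.

```python
import datetime


def normalize_train_date(train_date=None):
    """Return a yyyy-MM-dd string, defaulting to today when no date is given."""
    if train_date is None:
        return datetime.date.today().strftime("%Y-%m-%d")
    # strptime raises ValueError for a wrong format or an impossible date
    parsed = datetime.datetime.strptime(train_date, "%Y-%m-%d").date()
    # Re-format so that input like "2024-1-5" becomes zero-padded
    return parsed.strftime("%Y-%m-%d")


print(normalize_train_date("2024-1-5"))
```

Failing early with a ValueError keeps a malformed date from turning into a confusing API error later on.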

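Inspecting the data and organizing the origin and destination information

The information we need is found at the following locations (temp is one entry of 'TrainTimetables'):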

temp['TrainInfo']['TrainNo']  # the train number
temp['TrainInfo']['StartingStationID']  # the station this train departs from, used to check whether it starts at our origin
temp['StopTimes'][0]['StationID']  # ID of a station where this train stops
temp['StopTimes'][0]['ArrivalTime']  # time at which this train arrives at that station
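To try these lookups without an API call, here is a short sketch on a hand-made record shaped like one timetable entry. All values are illustrative, and the dict comprehension is just one convenient way to index the stops by station.

```python
# Hand-made record shaped like one entry of "TrainTimetables".
# Every value here is illustrative, not real timetable data.
temp = {
    "TrainInfo": {"TrainNo": "0001", "StartingStationID": "1000"},
    "StopTimes": [
        {"StationID": "1000", "ArrivalTime": "08:00"},
        {"StationID": "1010", "ArrivalTime": "08:20"},
    ],
}

train_no = temp["TrainInfo"]["TrainNo"]
starting_station = temp["TrainInfo"]["StartingStationID"]

# Index every stop by station ID so a station's arrival time is one lookup away
arrival_by_station = {stop["StationID"]: stop["ArrivalTime"] for stop in temp["StopTimes"]}

print(train_no, starting_station, arrival_by_station)
```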

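Next, wrap these observations into functions:

1. The stop information for each station is stored under "StopTimes", so given the stations to look up, we return a map of station to arrival time (get_arrivetime_by_strain_info).

2. Each train's record is handled separately. The goal is to extract, for the given origin and destination, the times at those stations, the train number, and whether the origin station is the train's starting station (get_train_sum_info_from_train_dict).

3. Loop over all trains of the day, apply the same analysis to each, and return the result (gen_train_api_info_to_df).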

def get_arrivetime_by_strain_info(train_info: list, train_list: list) -> dict:
    """Look up the arrival times at the origin and destination stations.

    Args:
        train_info (list): ['TrainTimetables'][i]['StopTimes'] from the API DataFrame; each element is a dict
        train_list (list): [origin station ID, destination station ID]

    Returns:
        dict: arrival times keyed by 'Start_Station' and 'Arrive_Station'
    """
    res = {}
    for stop in train_info:
        station_id = stop['StationID']
        if station_id == train_list[0]:
            res['Start_Station'] = stop['ArrivalTime']
        elif station_id == train_list[1]:
            res['Arrive_Station'] = stop['ArrivalTime']
    return res


def get_train_sum_info_from_train_dict(train_info: dict, ini_station: str, fin_station: str) -> dict:
    """Summarize one train's record for the given origin and destination.

    Args:
        train_info (dict): one entry of 'TrainTimetables' from the API
        ini_station (str): origin station name in Chinese
        fin_station (str): destination station name in Chinese
    """
    train_num = train_info['TrainInfo']['TrainNo']
    train_start_loc = train_info['TrainInfo']['StartingStationID']
    OriginStationID = get_train_num_from_train_name(ini_station)
    DestinationStationID = get_train_num_from_train_name(fin_station)
    time_hash = get_arrivetime_by_strain_info(train_info['StopTimes'], [OriginStationID, DestinationStationID])
    time_hash['Train_ID'] = train_num
    # True when the train starts its run at our origin station
    time_hash['Is_init_station'] = (train_start_loc == OriginStationID)
    return time_hash


def gen_train_api_info_to_df(data: pd.DataFrame, ini_station: str, fin_station: str) -> pd.DataFrame:
    # Apply the same summary to every train of the day
    summary_train_status = []
    for train in data['TrainTimetables']:
        summary_train_status.append(get_train_sum_info_from_train_dict(train, ini_station, fin_station))
    return pd.DataFrame(summary_train_status)


check_data = gen_train_api_info_to_df(get_train_data, '后里', '台中')  # test run
check_data.tail()
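One possible next step with a summary like this is computing the travel time of each train. The sketch below assumes the times are "HH:MM" strings, which should be checked against the actual API response; travel_minutes is a hypothetical helper and the rows are made-up sample data.

```python
import datetime


def travel_minutes(start_time: str, arrive_time: str) -> int:
    """Minutes between two "HH:MM" times, rolling over midnight if needed."""
    fmt = "%H:%M"
    start = datetime.datetime.strptime(start_time, fmt)
    arrive = datetime.datetime.strptime(arrive_time, fmt)
    if arrive < start:
        # The train arrives after midnight, so count it as the next day
        arrive += datetime.timedelta(days=1)
    return int((arrive - start).total_seconds() // 60)


# Made-up rows using the same keys as the summary dicts built above
rows = [
    {"Train_ID": "0001", "Start_Station": "08:00", "Arrive_Station": "08:25"},
    {"Train_ID": "0002", "Start_Station": "23:50", "Arrive_Station": "00:10"},
]
for row in rows:
    row["Minutes"] = travel_minutes(row["Start_Station"], row["Arrive_Station"])
    print(row["Train_ID"], row["Minutes"])
```

The midnight rollover is a simplifying assumption: it treats any arrival earlier than the start time as belonging to the following day.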

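With this, we can fetch the matching data for any origin, destination, and date. Many features of the TRA app are built on the results of these same API calls, so a platform like this lets frequent riders who find the TRA app lacking build exactly what they need.

Finally, thanks again to TDX for providing such a convenient platform for the public. It saves everyone from spending large amounts of time scraping data, improves efficiency, and inspires more applications.

The code is also available on GitHub.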
